Full Job Description
Job Summary:
- 10+ years of Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
- 10+ years of experience in Production support/Site Reliability Engineering teams with continued focus on improving Platform health
- Familiar with Agile or other rapid application development practices
- Hands-on expertise with Automated testing, Process Automation & building dashboards using APM tools.
- Experience with distributed (multi-tiered) systems and algorithms, and hands-on experience with Oracle and MongoDB databases.
- Knowledge of and exposure to caching tools (Redis, Memcached) or messaging tools such as MQ and Kafka.
- Must have working knowledge of APM tools such as Splunk, GCL, ELK, Grafana, Prometheus, etc.
- Able to create dashboards using GCL/Splunk/ELK and set up s.
- Working knowledge of CI/CD is a plus: source control such as Git, and continuous integration with Jenkins / UCD Release, etc.
- Ability to work with Engineering teams across the ecosystem, such as Security, Networking & Infrastructure, on challenges that can impact platform health & resiliency.
- Shell scripting and DevOps tools like Ansible, with good knowledge of YAML for writing playbooks.
- Experience with distributed storage technologies like NFS, as well as dynamic resource management frameworks and platforms such as PCF, Kubernetes / OpenShift, AWS or Azure.
- Tech Stack: Java/J2EE (Spring, Spring Boot), Python, Shell Scripting, Kafka, Oracle, MongoDB, etc.
- Able to work on shift duty in a 12/7 support organization.
- A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.
- Bachelor’s degree in computer science, computer science engineering, or related experience required.
Desired Qualifications:
- Recognizes opportunities to adopt innovative technologies such as AI/ML to enable business capabilities
- Keeps up to date on current research and technology in the industry
- Recognizes the importance of collaboration to achieve objectives
- Clearly communicates ideas and concepts to others
- Leads work effectively and acts on own initiative without being prompted
- Provides feedback to team members in code reviews
- Drives creative changes & continuous improvements
- Explores new automation techniques to refine the agility, speed and quality of engineering initiatives and efforts
- Gathers and analyzes metrics from both operating systems and applications to assist in performance tuning and fault finding
- Balances feature development speed and reliability with well-defined service level objectives
Job Expectations:
- You will be a core member of an SRE support team, using the latest technology tools to write code and test cases, work with API specs, and automate in order to maintain the resiliency, performance and availability of Digital Sales & Marketing platforms.
- Strong and relevant experience in supporting Web/API platforms built using a Java/JavaScript stack (Spring/Spring Boot, JavaScript with Angular/React).
- Proficiency in dealing with legacy infrastructure along with cloud infrastructure (on-prem & 3rd party) such as OCP, PCF or Azure.
- Identify opportunities to adopt new technologies, improve efficiency by removing toil, and continue to drive optimization.
- Proactive monitoring of app performance through Splunk, app dashboards, AppDynamics, Dynatrace, etc.
- Represent Platform engineering teams during production outages and collaborate with engineering teams to resolve them. Collaborate with stakeholders across the engineering function to own/derive the RCA & work towards permanent resolution.
- Plan, support, execute and comply with governance programs/processes in support of a strong control environment in your functional area. Leverage process documentation to improve operational controls and identify and remediate process deficiencies.
- Proactively identify, communicate, mitigate and escalate risk originating from non-compliance with processes, operational errors, and data integrity issues in all applicable processes.
- Ability to influence SRE practices within and outside teams to enable a strong DevOps culture within the organization
- Responsible for working with Engineering teams to maintain the SLAs & SLOs. Constantly look for opportunities to improve platform metrics & communicate them to stakeholders.
- Exposure to and proficiency in different API styles such as SOAP, REST, microservices, etc.
- Working knowledge of Unix, Linux and Postman
- This is a lead/senior engineer position in which you will act as a mentor and leader, coaching other engineers to enable the SRE function.
Skills
PRIMARY COMPETENCY: DevOps
PRIMARY SKILL: SRE
PRIMARY SKILL PERCENTAGE: 60
SECONDARY COMPETENCY: Java Technologies
SECONDARY SKILL: Java - REST API Development
SECONDARY SKILL PERCENTAGE: 40
Job Information
Job Category: Information Technology