Infrastructure Reliability Engineer
As a member of the Advisor Services Technology Service Availability and Engineering team, you will be immersed in a collaborative, innovative, and technically challenging environment. The platforms we support are essential to the success of the investment advisor and offer features such as client relationship management, portfolio accounting, trading, data integration, reporting, and much more.
We are looking for skilled candidates who are enthusiastic about learning new and existing technologies in order to deliver exceptional solutions for the production resiliency of our systems. The role combines aspects of software engineering and operations to come up with better ways of managing and operating applications. It requires a high level of responsibility and accountability, and it comes with the support structure necessary for professional growth. The successful candidate will have a proven track record of building and supporting enterprise applications.
What you're good at
- Practice Site Reliability Engineering mindset and solve problems through automation and instrumentation.
- Partner with the Architects, Dev Leads, and other SREs on the team to ensure implementations are architected and designed for production resiliency.
- Identify opportunities to build innovative tools and solve unique operations problems on large enterprise and mission critical applications
- Develop tools, frameworks, and instrumentation to validate and increase rollout success for applications.
- Partner within the Support organizations to build and roll out plans for enhanced telemetry, and reduce defect leakage for software delivery to production.
- Troubleshoot mission-critical application workflows in real time and incorporate feedback into product development.
- Work closely with dev teams during the design phase, and build and perform infrastructure upgrades to support our applications' availability and reliability.
- Monitor the current-state solution portfolio to identify deficiencies through aging of the technologies used by the application, or misalignment with business requirements.
- Understand, advocate and augment the Schwab Reliability Engineering principles, guidelines and standards
- Analyze the business-IT environment (run, grow and transform the business) to detect critical deficiencies, and recommend solutions for improvement
- Assist with the evaluation and selection of software product standards and services, as well as the design of standard and custom software configurations.
What you have
- Proven track record supporting production application development and support efforts adhering to a mix of DevOps & SRE frameworks
- Ability to grasp difficult concepts, large architectures, and sophisticated designs quickly
- Progressive experience supporting highly available, mission-critical environments, and experience leveraging tools to instrument and automate proactive and eventually predictive availability solutions
- Ability to understand multiple technologies and how they inter-relate and integrate
- Proven capability to provide operational visibility on environment health to technology and business partners
- Strong automation, innovation, and problem-solving skills
- Receptive, approachable teammate, with the ability to positively interact with business partners, technology teams, recruiting personnel, offshore, and professional services
- Strong customer advocate with good written and verbal communication skills
- Flexibility to participate in an on-call support rotation
- Extensive experience in enterprise-level infrastructure orchestration with Ansible, Chef, Salt, Puppet
- Experience in High Availability and distributed systems, Linux and Windows administration, troubleshooting and support
- Experience transitioning platforms to the cloud, good understanding of cloud technologies: GCP, AWS, Azure, PCF
- Experience with Atlassian tools: Jira, Confluence, Bamboo, Bitbucket
- Working knowledge of monitoring tools: Splunk, Zenoss, Elastic, AppDynamics, Dynatrace
- Knowledge of networking including DNS, DHCP, firewalls, load balancers and IP routing
- Familiarity with one or more databases: Oracle, SQL Server, MongoDB
- Preferred experience with C#, .NET, and scripting
- Excellent debugging skills across a variety of integrated platforms
- BS in computer science or a related technical field, with at least 5 years of experience with the listed technical skills