Principal Site Reliability Engineering Lead
Company: Pacific Gas and Electric Company
Location: Oakland
Posted on: May 18, 2025
Job Description:
Requisition ID# 165427
Job Category: Information Technology
Job Level: Manager/Principal
Business Unit: Information Technology
Work Type: Hybrid
Job Location: Oakland
Department Overview
The Data
Solutions Architecture Team at Pacific Gas & Electric Company is
responsible for driving long-term, enterprise-wide data solutions,
target state architecture, and overall excellence with the
application of data, analytics and information to critical business
challenges and opportunities. This team is chartered to develop the
strategy, roadmap, and accompanying standards that will enable
better use of data and information and to develop analytics
maturity at PG&E.
Position Summary
The Digital Utility runs on
data and information. At PG&E, we have many teams building data
products that need support, and our operations teams are on the
hook for ensuring reliability and support across all data products.
The Principal Site Reliability Engineering Lead fills a critical
role in empowering our operations teams to do their best work.
The Principal Site Reliability Engineering Lead will drive our
operations strategy in DA&I, working with operations teams on
implementing best practices, mentoring junior engineers, driving
automation, and building a continuously improving operations
practice. You will work with operations management and operations
engineers to create scalable DevOps practices for key data
platforms at DA&I, notably Palantir Foundry, Snowflake, and
Informatica. You will also get hands-on with operational problems
and build out operations tooling for the team.
We strive for a
team that will make a difference in the new PG&E. As Site
Reliability Engineering Lead, you will have a direct impact on the
day-to-day life of data solutions and delivery, and on the safety
of California. You will be collaborating with other technical
leaders and Executive Leadership to help reshape a first-class
operations team, with high levels of reliability for the data
products we, and our customers, rely on the most. As Site
Reliability Engineering Lead, you will work closely with supportive
Operations management, a talented team in need of your guidance,
and an organization looking to you to support their key
products.
The Principal Site Reliability Engineering Lead will
report to the Senior Manager of Data Solutions Architecture in the
Data Analytics & Insights department of Information Technology, and
work closely with the Data Ecosystem Operations team.
PG&E is providing the salary range that the company in good
faith believes it might pay for this position at the time of the
job posting. This compensation range is specific to the locality of
the job. The actual salary paid to an individual will be based on
multiple factors, including, but not limited to, specific skills,
education, licenses or certifications, experience, market value,
geographic location, and internal equity. We would not anticipate
that the individual hired into this role would land at or near the
top half of the range described below, but the decision will be
dependent on the facts and circumstances of each case.
A reasonable salary range is:
Bay Area Minimum: $155,000.00
Bay Area Maximum: $265,000.00
Job Responsibilities
- Technical Support and Collaboration: Provide applications
engineering support to product teams. Collaborate with product
teams, support teams, and customers on shared goals, cross-team
projects, and new initiatives.
- Continuous Improvement and Reliability Practices: Strive for
continuous improvement in processes and reliability practices.
Develop and evolve improved operations workflows.
- Leadership and Mentoring: Show teams how to improve quality and
eliminate waste by implementing improvements with them.
- Hands-on Troubleshooting: As a member of the Operations team,
you will join them on-call and be available to help with escalated
issues, or issues requiring your additional experience and steady
hand.
- Operations Tooling: You will build tools for improved
operational workflows in collaboration with, and leading, members
of the Operations team.
- Efficiency: Identify wasteful processes and procedures. Work
with teams to streamline and automate tasks.
- Performance Monitoring and Improvement: Monitor, measure, and
enhance the performance and state-awareness of systems. Identify
and drive improvements in infrastructure and system reliability,
performance, and monitoring.
- Root Cause Analysis and Investigation: Lead investigations into
repetitive damage and failure rates, utilizing root cause analysis
techniques. Implement corrective and preventive actions based on
findings.
- Reliability and Capital Planning: Participate in annual and
long-term reliability planning, ensuring alignment with operational
objectives. Contribute to the development and execution of life
cycle asset management processes.
- Architecture: Own the Information Architecture and related
Technical Architecture for the Operations sub-domain of the Data &
Information Architecture domain.
- Technology Life Cycle: Develop and execute strategies to
introduce new capabilities needed, evolve and mature existing
capabilities, and retire capabilities at their end of life.
- Documentation and Governance: Develop and maintain
architectural guidance documents and artifacts, practices and
procedures, and governance to support the above.
- Strategic Planning: Support technology strategy, planning, and
roadmapping activities across IT and at the enterprise level.
- Data Analysis and Predictive Modeling: Perform statistical data
analysis. Utilize data insights for capacity planning, demand
forecasting, and identifying performance
bottlenecks.
Qualifications
Minimum:
- Bachelor's degree in Computer Science or job-related discipline
or equivalent experience
- 7 years of relevant work experience in Information Technology,
Data Management, Business Intelligence, and Analytics, to include
experience in both IT and line of business departments
Desired:
- Experience working directly with line of business stakeholders
demonstrating job-related skills.
- 5 or more years of experience with Site Reliability
Engineering/DevOps practices.
- Experience with analytics and data management principles such
as: data acquisition and modeling, data warehousing, business
intelligence, metadata management, master data management, advanced
analytics and data science, "big data" techniques,
public/hybrid/private cloud data management and analytics services,
data security, and data and analytics governance.
- Ability to achieve a deep understanding of line of business
strategies, priorities, needs, and current capabilities.
- Ability to work collaboratively to engage and influence
business and IT stakeholders, senior leadership and external
partners.
- Customer management and negotiation skills that enable the
ability to mediate opposing viewpoints and articulate the
advantages of a preferred solution.
- Excellent written and oral communication skills across all
levels; ability to communicate complex technical concepts to
leaders, business sponsors and stakeholders in clear, concise
language that inspires confidence and earns trust.
- Strong leadership skills in the technology and operations
domain and a high level of drive, initiative and
assertiveness.
- Extensive experience with SRE/DevOps practices and tooling.
- At least 3 years of experience developing operations automation
tools in Python or another high-level scripting language commonly
used on Unix systems.
- Familiarity with two or more of: Scaled Agile, Scrum
development methodology, DevOps/DevSecOps, LEAN, Six Sigma or ITIL
practices.
- Experience with any of the following: Data Architecture,
Airflow, Palantir Foundry, Informatica, Spark, Snowflake, Teradata,
and other database and BI technologies; data access languages such
as SQL, SAS, R, Python, and Scala.
- Experience working in the Utility Industry and a working
knowledge of Utility concepts and challenges is a plus.