
Data Scientist I/II/Sr
- Houston, TX
- Permanent
- Full-time
- Design, build, and deploy predictive models and machine learning (ML) algorithms to support asset performance, reliability, and commercial optimization.
- Conduct exploratory data analysis (EDA), feature engineering, and statistical modeling using Python, R, or similar tools.
- Develop time series forecasting, anomaly detection, and classification models for operational and business use cases.
- Apply geospatial and sensor data analytics to support pipeline monitoring, flow optimization, and risk assessment.
- Leverage Amazon Web Services (AWS) tools such as SageMaker, Redshift, Simple Storage Service (S3), Lambda, and Glue for model training, deployment, and data access.
- Collaborate with data engineers to ensure models are integrated into production pipelines and dashboards.
- Use version control (e.g., Git) and continuous integration/continuous deployment (CI/CD) practices to manage the model lifecycle and ensure reproducibility.
- Partner with operations, engineering, and commercial teams to identify high-impact use cases and translate business needs into analytical solutions.
- Present findings and recommendations through compelling data visualizations and storytelling using tools like Power BI or Tableau.
- Support the development of self-service analytics and promote data literacy across the organization.
- Document model assumptions, limitations, and performance metrics to ensure transparency and trust.
- Monitor model performance and retrain as needed to maintain accuracy and relevance.
- Implement MLOps practices to support scalable, automated model deployment and monitoring.
- Ensure compliance with data governance, privacy, and ethical AI standards.
- Stay current with industry trends and emerging technologies to continuously improve analytical capabilities.
- Bachelor’s or Master’s degree in Data Science, Computer Science, Engineering, Statistics, or a related field.
- 3–5+ years of experience in applied data science, preferably in the energy, utilities, or industrial sectors.
- Proficiency in Python, SQL, and data science libraries such as pandas, scikit-learn, TensorFlow, or PyTorch.
- Experience working with AWS services including SageMaker, Redshift, S3, and Lambda.
- Strong understanding of statistical modeling, machine learning, and data visualization techniques.
- Ability to communicate complex technical concepts to non-technical stakeholders.
- Experience working with large datasets and real-time or near-real-time data environments.
- Experience in the natural gas midstream or broader oil & gas industry.
- Familiarity with MLOps practices and tools for model monitoring and retraining.
- Exposure to geospatial data, sensor data, or SCADA systems.
- Experience with Power BI, Tableau, or similar BI tools.
- Knowledge of data governance, security, and compliance in cloud environments.
- Bachelor’s degree in Computer Science, Engineering, Statistics, or a related field.
- Master’s degree