Sr Data Scientist

Cloud BC Labs

  • Reston, VA
  • Permanent
  • Full-time
  • 8 hours ago
Note: PhD preferred; at least a Master's degree required.
Role: Sr Data Scientist
Duration: Long-term
Location: Reston, VA
Job Description
Minimum Qualifications:
  • Work or educational background in one or more of the following areas: machine learning, computational linguistics, deep learning, artificial intelligence, data science and/or data analytics, generative AI, symbolic AI, causal AI, operations research, computer science, mathematics, business analytics, or knowledge management.
  • Demonstrated experience programming with R/Python, Linux, and Spark in an AWS cloud environment, or knowledge and algorithmic design experience in Python (3+ years)
  • Proficient with Amazon SageMaker, Jupyter Notebook, and Python scikit-learn, as well as deep learning and machine learning tools such as TensorFlow
  • Experience with image processing models such as COCO, CLIP, ResNet, or comparable models
  • Demonstrated experience with machine learning techniques, including natural language processing and large language models (GPT-4, o1, o3, OpenAI APIs, Llama, Claude, etc.).
  • Experience developing AI agents and proficiency with agentic programming
  • Proficient in natural language processing (NLP) and natural language generation (NLG), including prior projects in any of the following categories: topic modeling of text, sentiment analysis of text, part-of-speech tagging, Named Entity Recognition (NER), bag of words, text extraction
  • Experience building and working with any of these components: vector databases, BERT, RoBERTa (or comparable tools), spaCy, LLM and GenAI tools. Experience with LoRA, LangChain, RAG, LLM fine-tuning and PEFT, and knowledge graphs.
  • Strong skills in developing GraphRAG, Chain of Thought (CoT), Tree of Thought (ToT), reinforcement learning, and AI development architectures with Human-in-the-Loop (HITL)
  • Demonstrated experience with SQL and any relational database technologies, such as Oracle, PostgreSQL, MySQL, RDS, Redshift, Hadoop EMR, Hive, etc.
  • Demonstrated experience processing structured and unstructured data sources, including data cleansing, data normalization, and preparation for analysis
  • Demonstrated experience with code repositories and build/deployment pipelines, specifically Jenkins and/or Git/GitHub/GitLab.
  • Demonstrated experience using Tableau, Kibana, QuickSight, or other similar data visualization tools.
  • Very comfortable working with ambiguity (e.g. imperfect data, loosely defined concepts, ideas, or goals)
Qualifications & Requirements
  • Education: MS in Computer Science, Statistics, Math, Engineering, or related field, PhD preferred
  • 3+ years of relevant experience building large-scale machine learning or deep learning models and/or systems
  • 1+ year of experience specifically with deep learning (e.g., CNN, RNN, LSTM)
  • 1+ year of experience building NLP and NLG tools.
  • Experience with a wide range of LLMs (Llama, Claude, OpenAI, Cohere, etc.), LoRA, LangChain, RAG, LLM fine-tuning, and PEFT is preferred.
  • Demonstrated skills with Jupyter Notebook, Amazon SageMaker, Domino Data Lab, or comparable environments
  • Passion for solving complex data problems and generating cross-functional solutions in a fast-paced environment
  • Knowledge of Python and SQL, object-oriented programming, and service-oriented architectures
  • Strong scripting skills in shell and SQL
  • Strong coding skills and experience with Python (including SciPy, NumPy, and/or PySpark) and/or Scala.
  • Knowledge and implementation experience with NLP techniques (topic modeling, bag of words, text classification, TF-IDF, sentiment analysis) and NLP technologies such as NLTK, spaCy, or comparable technologies
  • Knowledge and implementation experience with statistical and machine learning models (regression, classification, clustering, graph models, etc.)
Preferred Qualifications
  • Hands-on experience building models with deep learning frameworks like TensorFlow, Keras, Caffe, PyTorch, Theano, H2O, or similar
  • Experience with LLM agents and agentic programming
  • Experience with search architecture (for instance: Solr, Elasticsearch, AWS OpenSearch)
  • Experience building and querying ontologies with technologies such as Zeno, OWL, RDF, SPARQL, or comparable is preferred
  • Knowledge of and experience with microservices, service mesh, API development, and test automation is preferred
  • Demonstrated experience using Docker, Kubernetes, and/or other similar container frameworks is preferred
  • Strongly prefer a PhD in math, computer science, statistics, or a comparable field, with experience in data science, AI development, deep learning, and advanced analytics
Additional Job Qualifications:
  • Ability to translate business ideas into analytics models that have major business impact.
  • Demonstrated experience working with multiple stakeholders.
  • Demonstrated communication skills, e.g. explaining complex technical issues to more junior data scientists, in graphical, verbal, or written formats.
  • Demonstrated experience developing tested, reusable and reproducible work.
  • Ability to document code and methodologies transparently.
  • Ability to work in Agile, Lean, and rapid development processes.
Cloud BC Labs Inc is a digital transformation organization that aims to create seamless solutions for clients to manage their business operations effectively. The company specializes in Business and Management Consulting, AI/ML, Data Analytics & Visualization, Cloud Data Warehouse Migration, Snowflake Implementation, Informatica Implementation & Upgrade, Staffing Services, and Data Management Solutions.
