Senior Data Scientist
Remote – USA
Remote – Canada
Full time
Calix provides the cloud, software platforms, systems and services required for communications service providers to simplify their businesses, excite their subscribers and grow their value.
This is a remote-based position that can be located anywhere in the United States or Canada.
Our Products Team is growing and we’re looking for an innovative and experienced Data Scientist to lead and contribute to our Generative AI initiatives. In this role, you will work on developing advanced generative models for text, image, and other data types, leveraging state-of-the-art machine learning techniques. You will be a key player in driving the research, development, and optimization of generative AI models, working alongside machine learning engineers, AI researchers, and data engineers to bring these models into production.
Key Responsibilities:
- Develop Advanced Generative Models: Research, design, and implement generative models (e.g., GPT, GANs, VAEs, Transformer architectures) for a variety of tasks, including text generation, image synthesis, and creative AI applications.
- Define Platform Requirements: Formulate, design, and build platform components. As an experienced scientist, you will typically serve in a lead role overseeing multiple use cases and will clearly specify what a platform or solution needs.
- Experimentation and Prototyping: Design and conduct experiments to explore new algorithms and approaches for enhancing generative model performance. Prototype models and test them against established benchmarks.
- Model Training and Optimization: Lead the training and fine-tuning of large-scale AI models, optimizing them for specific use cases. Use advanced techniques such as transfer learning, reinforcement learning, and self-supervised learning to enhance performance.
- Data Analysis and Feature Engineering: Collaborate with data engineers and analysts to preprocess large datasets, extract features, and create high-quality training data for generative AI models. Ensure data quality and integrity throughout the pipeline.
- Performance Evaluation and Iteration: Evaluate model outputs using metrics such as FID, BLEU, or perplexity, and iteratively refine models based on quantitative feedback. Use statistical methods and A/B testing to assess model performance in production environments.
- Collaborate Across Teams: Work closely with machine learning engineers, software developers, and AI researchers to integrate models into production systems. Help define the technical roadmap and prioritize tasks aligned with business objectives.
- Drive Research and Innovation: Stay up-to-date with the latest research in generative AI, machine learning, and deep learning, and apply new advancements to improve models and systems.
- Mentorship and Leadership: Provide technical mentorship and guidance to junior data scientists and machine learning engineers, fostering a culture of innovation, collaboration, and continuous learning.
Qualifications:
- Bachelor’s, Master’s, or Ph.D. in Data Science, Computer Science, Machine Learning, Statistics, or a related field.
- 5+ years focused on delivering use cases or core AI/ML platform components.
- 8+ years focused on quantitative and analytical work with data.
- Solid background in quantitative and advanced mathematics (applied mathematics, quantitative economics, statistics).
- Experience working with domain experts and translating their knowledge and requirements into AI products.
- Strong background in applying statistical learning to interpret, evaluate, and optimize outcomes, and familiarity with metrics and their application to use cases.
- Experience preparing datasets, including synthetic datasets, for training, testing, and validation.
- Proven experience with deep learning models, including transformer architectures, GANs, VAEs, or similar generative techniques.
- Strong track record in building, training, and deploying machine learning models for real-world applications.
- Expertise in Python and deep learning frameworks such as TensorFlow, PyTorch, or Keras.
- Strong understanding of NLP techniques (e.g., GPT, BERT) and/or computer vision models (e.g., GANs, VAEs).
- Experience with data processing, feature engineering, and working with large-scale datasets.
- Proficiency with SQL and experience with data pipeline tools (e.g., Apache Spark, Hadoop, Airflow).
- Familiarity with MLOps practices for model deployment and monitoring.
- Experience with privacy and compliance, working with large-scale data, deploying and maintaining models in production, and implementing and optimizing ML algorithms.
- Strong problem-solving skills and ability to work in a fast-paced, iterative environment.
- Excellent communication skills, both written and verbal, with the ability to present technical concepts to non-technical stakeholders.
- Proven leadership skills and ability to mentor and guide team members.
Preferred Skills:
- Experience with multimodal models (text, image, video) in generative AI applications.
- Familiarity with Reinforcement Learning, Self-Supervised Learning, or Few-shot Learning.
- Knowledge of cloud computing platforms (AWS, GCP, Azure) and experience deploying models in cloud environments.
- Experience publishing in AI/ML conferences or journals, and active participation in AI research communities.
- Generative AI experience: RAG pipeline components, LLM pre-training, alignment, fine-tuning, and different types of LLMs and their applications.
Compensation will vary based on geographical location (see below) within the United States. Individual pay is determined by the candidate’s location of residence and multiple factors, including job-related skills, experience, and education.
Different ranges apply to specific locations. The average base pay range (or OTE range for sales) in the U.S. for this position is listed below.
San Francisco Bay Area Only:
133,400.00 – 226,600.00 USD Annual
All Other Locations:
116,000.00 – 197,000.00 USD Annual