ML Data Linguist – English and Japanese, AWS AI Data

Title: ML Data Linguist – English and Japanese, AWS AI Data

Location: WA-Seattle

Job Description

Amazon Web Services (AWS) is looking for a data associate to help with annotations and data analysis. As part of the AI Data Team at AWS, you will be responsible for delivering high-quality training data to ensure the best performance of the AWS machine learning systems, including LLMs. Our goal is to produce the highest-quality training data in the industry and to delight our customers by improving human language understanding and natural language processing.

This role focuses on English speech and language data, primarily in the areas of speech transcription, text annotation, and other general development of high-quality language data deliverables. You will work closely with Language Engineers to create data sets for model training, benchmarking, and evaluation. The successful candidate must have a background in analyzing language data and a passion for efficiency and accuracy.

The AI Data Team at AWS is responsible for delivering high-quality annotated data and a variety of language artifacts to ensure the best performance of different AWS machine-learning language services. These ML-based language services enable customers to readily add intelligence to their business operations and AI applications to drive positive outcomes.

Key job responsibilities

– Build a thorough understanding of data collection and annotation guidelines and various annotation tools.

– Annotate natural language data in English and Japanese accurately within deadlines, adhering to guidelines.

– Participate in data generation, collection, and quality assurance tasks.

– Dive deep into the data to perform qualitative error trend analysis.

– Handle unique data collection and analysis requests for different NLP/NLU applications.

– Collaborate with other ML Data Linguists to resolve data ambiguities and annotation disagreements.

– Provide feedback to Language Engineers on annotation guidelines, tooling, and processes to drive improvements.

– Dive deep into issues and implement solutions independently.

– Contribute to process improvements to reduce handling time and improve resource output.

– Develop a variety of language artifacts crucial for model development such as datasets for training and evaluation.

About the team

AWS Utility Computing (UC) provides product innovations – from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS's services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services.

Diverse Experiences

AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.

Why AWS?

Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating – that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.

Inclusive Team Culture

Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.

Mentorship & Career Growth

We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.

Work/Life Balance

We value work-life harmony. Achieving success at work should never require sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.

Hybrid Work

We value innovation and recognize this sometimes requires uninterrupted time to focus on a build. We also value in-person collaboration and time spent face-to-face. Our team affords employees options to work in the office every day or in a flexible, hybrid work model near one of our U.S. Amazon offices.

Basic Qualifications

– Bachelor’s degree in Linguistics, Communication, Cognitive Science, or a related field, with a background in phonetics, semantics, pragmatics, conversation analysis, and/or discourse analysis.

– At least 6 months of experience with natural language data labeling and other forms of data markup.

– Native or near-native proficiency in Japanese and English (US) (CEFR C1 or above).

– Excellent communication and strong organizational skills, with a keen eye for detail.

– Comfortable working in a fast-paced, highly collaborative, and dynamic work environment.

– Willingness to support several projects at one time, and to accept re-prioritization as necessary.

Preferred Qualifications

– Ability to quickly learn new guidelines, technical concepts, and software.

– Familiarity with command line interfaces and basic Unix commands.

– Proficiency in additional languages, such as German, Chinese, Spanish, French, or Portuguese.

– Familiarity with common text processing tools.

– Working knowledge of a variety of file formats and markup languages (e.g., JSON, XML, HTML).

– Passion for language, linguistics, human language technology and AI.

– Basic to intermediate scripting skills in one or more of the common programming languages (Python, HTML, JavaScript).

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit

Pursuant to the Los Angeles Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $32,700/year in our lowest geographic market up to $70,000/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit This position will remain posted until filled. Applicants should apply via our internal or external career site.