We're building high-quality evaluation and training datasets to improve how Large Language Models (LLMs) interact with realistic software engineering tasks. You will have the opportunity to work on a diverse range of projects, from helping models traverse complex codebases to building agents that improve model performance.
Role Overview -- What Does a Typical Day Look Like?
Work across multiple projects to improve LLM performance on code. Sample projects:
Lead and deliver end-to-end agent use cases such as home automation agents, coding copilots, or creative design assistants.
Collaborate with the team to identify edge cases and ambiguities in model behavior.
Review and compare 3-4 model-generated code responses per task using a structured ranking system.
Evaluate code diffs for correctness, code quality, style, and efficiency. Provide clear, detailed rationales explaining the reasoning behind each ranking decision.
Required Skills & Experience
Several years of software engineering experience, including 2+ continuous years at a top-tier product company (e.g., Google, Stripe, Amazon, Apple, Meta, Netflix, Microsoft, Datadog, Dropbox, Shopify, PayPal, IBM Research).
Strong expertise in building full-stack applications and deploying scalable, production-grade software using modern languages and tools.
Deep understanding of software architecture, design, development, debugging, and code quality/review assessment.
Proven ability to review code diffs and evaluate correctness, maintainability, and efficiency.
Excellent oral and written communication skills, including the ability to write clear, structured evaluation rationales.
Engagement Details
Commitment: flexible engagement, minimum 10 hrs/week, up to 40 hrs/week (partial PST overlap required).
Type: Contractor (no medical/paid leave).
Duration: 1 month, with potential extensions based on performance and fit.
Job Type: Seasonal
Pay: $80.00-$85.00 per hour
Expected hours: 40 per week
Language:
* English (preferred)