Prompt Engineer
Syntas is looking for a Prompt Engineer to specialize in the art and science of getting the best possible outputs from large language models. You will design prompting strategies, build evaluation frameworks, implement systematic prompt optimization processes, and help our clients extract maximum value from their AI investments through better instructions, examples, and system architectures. This role sits at the intersection of linguistics, psychology, and software engineering. You will develop deep expertise in how different models respond to various prompting techniques, create reusable patterns that can be adapted across use cases, and build the tooling and processes that turn prompt development from guesswork into a repeatable, scientific discipline.
About the Role
As a Prompt Engineer at Syntas, you will be the specialist who makes AI systems actually work well in production. While many engineers can get a demo working with a simple prompt, you understand the gap between demo and production—the edge cases, the failure modes, the consistency requirements—and you know how to close that gap through systematic prompt engineering.
Your work begins with understanding what the AI system needs to accomplish and how success will be measured. You will analyze the types of inputs the system will receive, the variations and edge cases it must handle, and the quality standards outputs must meet. From this understanding, you will design prompting strategies that might include few-shot examples, chain-of-thought reasoning, structured output formats, self-consistency techniques, or multi-step workflows with verification.
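To make that concrete, here is a minimal sketch of one such strategy: few-shot examples plus a chain-of-thought field inside a JSON output contract. The OpenAI Python SDK, model name, and ticket-classification task are illustrative assumptions, not a prescribed stack.

```python
# Minimal sketch: few-shot prompting with chain-of-thought and a
# structured (JSON) output contract. Model and examples are illustrative.
import json
from openai import OpenAI

client = OpenAI()

FEW_SHOT = [
    {"role": "user", "content": "Ticket: 'I was charged twice this month.'"},
    {"role": "assistant", "content": json.dumps({
        "reasoning": "The customer reports a duplicate charge, a billing issue.",
        "category": "billing",
        "urgency": "high",
    })},
]

def classify_ticket(ticket: str) -> dict:
    messages = [
        {"role": "system", "content": (
            "Classify the support ticket. Think step by step in the "
            "'reasoning' field, then output JSON with keys: "
            "reasoning, category, urgency."
        )},
        *FEW_SHOT,  # worked examples anchor the output format
        {"role": "user", "content": f"Ticket: {ticket!r}"},
    ]
    resp = client.chat.completions.create(
        model="gpt-4o",  # illustrative; any JSON-mode-capable model works
        messages=messages,
        response_format={"type": "json_object"},  # constrain to valid JSON
    )
    return json.loads(resp.choices[0].message.content)
```

In practice, a worked example tends to anchor the output schema more reliably than instructions alone, which is exactly the kind of trade-off you will evaluate in this role.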
Evaluation is central to your role. You will build frameworks that measure prompt performance across relevant dimensions—accuracy, consistency, latency, cost, safety—and use these measurements to systematically improve. You will implement A/B testing infrastructure, create golden datasets for regression testing, and design LLM-as-judge evaluation chains that can assess quality at scale. Your approach to prompt engineering is empirical: hypotheses are tested, results are measured, and iterations are driven by data.
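As an illustration of that empirical loop, here is a minimal LLM-as-judge sketch that grades candidate answers against a golden reference and gates a release on the mean score. The judge model, rubric, and threshold are illustrative assumptions, not a prescribed setup.

```python
# Minimal LLM-as-judge sketch: grade a model answer against a golden
# reference on a 1-5 scale, then gate a release on the mean score.
from openai import OpenAI

client = OpenAI()

JUDGE_PROMPT = """You are grading an AI answer against a reference.
Reference answer: {reference}
Candidate answer: {candidate}
Score the candidate from 1 (wrong) to 5 (fully correct and complete).
Reply with the integer score only."""

def judge(reference: str, candidate: str) -> int:
    resp = client.chat.completions.create(
        model="gpt-4o",  # illustrative judge model
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(
            reference=reference, candidate=candidate)}],
        temperature=0,  # keep grading as deterministic as possible
    )
    return int(resp.choices[0].message.content.strip())

def passes_regression(golden: list[dict], threshold: float = 4.0) -> bool:
    """Regression gate over a golden dataset: fail if mean score drops."""
    scores = [judge(g["reference"], g["candidate"]) for g in golden]
    return sum(scores) / len(scores) >= threshold
```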
You will also work on the tooling and infrastructure that supports prompt engineering at scale. This includes prompt versioning systems, experiment tracking, production monitoring for prompt performance, and documentation frameworks that capture the reasoning behind prompt design decisions. You will create processes that let teams iterate on prompts safely, with appropriate testing and rollback capabilities.
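The shape of that workflow can be as simple as a versioned registry with promote and rollback operations. The sketch below is illustrative only; real deployments would typically lean on tools like Langfuse rather than hand-rolled classes.

```python
# Minimal sketch of versioned prompts with promote/rollback semantics.
# All names are illustrative, not a production design.
from dataclasses import dataclass, field

@dataclass
class PromptRegistry:
    versions: dict[str, list[str]] = field(default_factory=dict)
    live: dict[str, int] = field(default_factory=dict)

    def publish(self, name: str, template: str) -> int:
        """Store a new version; it is not live until promoted."""
        self.versions.setdefault(name, []).append(template)
        return len(self.versions[name]) - 1

    def promote(self, name: str, version: int) -> None:
        """Point production traffic at a tested version."""
        self.live[name] = version

    def rollback(self, name: str) -> None:
        """Revert to the previous version after a regression."""
        self.live[name] = max(self.live[name] - 1, 0)

    def get(self, name: str) -> str:
        """Fetch the template currently serving production."""
        return self.versions[name][self.live[name]]
```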
What You Will Build
- Prompting strategies for complex use cases including multi-step reasoning, tool use, and agentic workflows
- Evaluation frameworks that measure prompt quality across accuracy, consistency, latency, and cost dimensions
- Few-shot example libraries and retrieval systems that dynamically select relevant examples for each query (see the sketch after this list)
- Prompt versioning and experimentation infrastructure using tools like Langfuse for systematic optimization
- LLM-as-judge evaluation chains for automated quality assessment at scale
- Production monitoring systems that track prompt performance and surface degradation or drift
- Documentation and best practices guides that codify effective prompting patterns for reuse
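For the dynamic example-selection item above, a minimal sketch looks like this: embed a curated example library once, then retrieve the nearest neighbors for each incoming query. The embedding model and example pairs are illustrative assumptions.

```python
# Minimal sketch of dynamic few-shot selection: embed an example
# library once, then pick the nearest neighbors for each query.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(
        model="text-embedding-3-small",  # illustrative embedding model
        input=texts,
    )
    vecs = np.array([d.embedding for d in resp.data])
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

# Illustrative (input, ideal_output) pairs curated per use case.
EXAMPLES = [
    ("Refund for duplicate charge", "category: billing"),
    ("App crashes on login", "category: bug"),
]
EXAMPLE_VECS = embed([inp for inp, _ in EXAMPLES])

def select_examples(query: str, k: int = 3) -> list[tuple[str, str]]:
    qvec = embed([query])[0]
    scores = EXAMPLE_VECS @ qvec          # cosine similarity (unit vectors)
    best = np.argsort(scores)[::-1][:k]   # top-k most similar examples
    return [EXAMPLES[i] for i in best]
```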
Key Responsibilities
- Design prompting strategies for diverse client use cases across industries and applications
- Build evaluation frameworks that measure prompt performance against defined quality criteria
- Implement systematic prompt optimization using A/B testing, ablation studies, and iterative refinement
- Create few-shot example libraries and develop strategies for example selection and retrieval
- Develop structured output prompts using JSON mode, function calling, and constrained generation
- Design chain-of-thought and multi-step prompting strategies for complex reasoning tasks
- Implement prompt versioning, experiment tracking, and production deployment workflows
- Build LLM-as-judge evaluation systems for scalable quality assessment
- Analyze prompt failures and edge cases to identify improvement opportunities
- Collaborate with engineering teams to integrate prompting best practices into application development
- Document prompting patterns, evaluation methodologies, and lessons learned for knowledge sharing
- Stay current with prompting research, new model capabilities, and emerging techniques
- Train client teams on prompt engineering best practices and evaluation methods
What We Are Looking For
- 3+ years of experience working with LLMs, with at least 1 year focused specifically on prompt engineering
- Deep understanding of prompting techniques: few-shot learning, chain-of-thought, self-consistency, and structured outputs
- Experience with prompt evaluation methods including human evaluation, automated metrics, and LLM-as-judge
- Strong analytical skills with ability to design experiments and interpret results systematically
- Proficiency in Python for building evaluation frameworks, data processing, and automation
- Experience with LLM observability tools (Langfuse, LangSmith, Weights & Biases)
- Understanding of different models (GPT-4, Claude, Llama, Mistral) and their respective strengths and weaknesses
- Familiarity with advanced techniques: RAG integration, tool use, function calling, and agentic patterns
- Excellent written communication for crafting prompts and documenting methodologies
- Strong attention to detail—prompt engineering often comes down to precise wording choices
- Consultative mindset with ability to understand client requirements and translate to prompting strategies
- Self-directed work style with ability to drive projects independently in a remote environment
Nice to Have
- Background in linguistics, cognitive science, or technical writing
- Experience with model fine-tuning and understanding when to fine-tune vs. prompt
- Knowledge of prompt injection vulnerabilities and defensive prompting techniques
- Experience with constitutional AI and RLHF concepts
- Background in verticals where precision is critical: legal, medical, or financial
- Experience with multimodal prompting (vision-language models)
- Familiarity with prompt optimization tools and automatic prompt generation
- Knowledge of token economics and cost optimization strategies
- Experience building prompt management systems or platforms
- Prior experience teaching or training others on prompt engineering
- Public writing or speaking on prompt engineering topics
- Contributions to open source prompt libraries or frameworks
Benefits & Perks
- Competitive salary: $130,000 - $190,000 depending on experience
- Equity participation with meaningful upside as we grow
- Fully remote work with flexible hours—work from anywhere in the US
- Comprehensive health, dental, and vision insurance (100% premium covered for employee)
- Unlimited PTO with an encouraged minimum of 4 weeks (we mean it)
- $3,000 annual learning and development budget for courses, books, and certifications
- Conference attendance budget including travel—attend or speak at AI conferences
- Top-tier hardware: MacBook Pro, external display, and peripherals of your choice
- All AI tools and subscriptions you need: GPT-4, Claude, GitHub Copilot, and more
- Quarterly team offsites in interesting locations
- 401(k) with company match
- Paid parental leave (12 weeks)
- Home office setup stipend ($1,000)
- Work on genuinely interesting problems across diverse industries
Ready to Apply?
Send us your resume and a brief introduction. Tell us about your experience with AI/ML systems and what excites you about this opportunity.
