AI Model Architecture Optimization Engineer R&D

Company: OpenInfer
Location: San Mateo
Posted on: May 3, 2025

Job Description:

San Mateo, CA
Full-time

Position Overview

We are looking for an experienced AI Acceleration Engineer who can dive deep into large model (e.g., transformer) architectures and blocks such as self/cross/multi-attention, and perform research and development of advanced techniques to accelerate these areas. The ideal candidate will have a deep understanding of large model design and AI acceleration techniques, and will integrate these advancements into the PyTorch stack. Familiarity with Python is essential, and experience with CUDA programming is highly desirable.

Key Responsibilities

- Innovate on AI model components, such as attention blocks, KV-cache strategies, layer streaming, tokenization, layer norms, and more, to improve AI model performance and scalability.
- Optimize and integrate AI acceleration techniques into the PyTorch stack, enabling efficient use across diverse hardware platforms.
- Own & drive features end to end to push the limits of large model architecture, ensuring seamless integration with existing frameworks.
- Benchmark and profile AI models to evaluate performance improvements, ensuring optimal execution on target hardware.
- Write and maintain clean, efficient code in Python, with a focus on integration with PyTorch.
- Leverage CUDA for GPU-based acceleration when necessary, optimizing the attention blocks for maximum performance.
- Work on cross-functional teams to design, implement, and test new features.

Qualifications

- Extensive experience with large AI model architectures, particularly with attention blocks and transformer models.
- Proficiency in Python and hands-on experience with the PyTorch framework.
- Strong understanding of AI acceleration techniques and their application in real-world use cases.
- Familiarity with CUDA for GPU programming is highly desirable.
- Demonstrated ability to optimize complex models for performance across different hardware environments.
- Experience in developing and deploying AI models at scale is a plus.

What You'll Gain

- Opportunity to work alongside industry experts in AI optimization, high-performance computing, and hardware acceleration.
- Hands-on experience with cutting-edge technologies at the intersection of AI and hardware acceleration.
- Exposure to open-source development and collaboration with a vibrant community.

Benefits We Offer:

At OpenInfer we offer comprehensive benefits, including:

- Medical, Dental, and Vision benefits for you and your family
- Flexible Paid Time Off, 10 days
- 401(k) Plan with company matching
- Snacks and coffee to keep you energized

These benefits are further detailed in OpenInfer policies and are subject to change at any time, consistent with the terms of any applicable compensation or benefits plans.

How to Apply

Please send your resume and a brief cover letter to recruiting@openinfer.io. Include examples of your work with large AI models, attention blocks, or open-source contributions where applicable.


