AI Accent Conversion - Krisp Blog
https://krisp.ai/blog/category/enterprise/ai-accent-conversion/

Krisp Launches Accent Conversion for Africa
https://krisp.ai/blog/krisp-launches-accent-conversion-for-africa/
Thu, 16 Oct 2025 13:10:04 +0000

The real-time Voice AI leader brings tested Accent Conversion technology to Africa, one of the world’s fastest-growing customer experience hubs

Durban, South Africa, October 16, 2025 - Krisp, the leader in real-time voice AI technology, today announced the launch of Accent Conversion for Africa, enabling clearer and more natural conversations across the continent’s customer experience (CX) sector. As one of the world’s fastest-emerging outsourcing hubs, Africa has a highly skilled, English-speaking workforce with strong cultural alignment to Western markets, positioning it as a strategic bridge for CX operations across Africa and Europe. Krisp Accent Conversion for Africa supports African English accents, including South African, Ugandan, Kenyan, and Nigerian.

 

The launch builds on the success of Krisp Accent Conversion 3.7, which supports Indian, Pakistani, Filipino, and Latin American English accents. Powering CX operations at tier-1 banks, insurers, and BPOs worldwide, Krisp’s AI-powered solution continues to set the industry benchmark for speech clarity, phoneme precision, and naturalness. Krisp Accent Conversion for Africa delivers near-native comprehension between contact center agents and consumers, demonstrating higher performance than both competitors and unprocessed voice.

“Even in the age of AI, human agents are at the front lines of every meaningful customer interaction and they deserve to be clearly understood,” said Davit Baghdasaryan, Co-Founder and CEO of Krisp. “As the CX industry evolves to become more AI-driven, one thing remains constant: human connection drives loyalty and trust. With Krisp, clarity becomes universal, not cultural, by removing accent bias and empowering every voice to connect globally.”

Advantages of Krisp Accent Conversion for Africa include:

  • Proven performance + measurable impact: Krisp is already trusted at scale, with 250,000+ enterprise seats deployed and 80B+ minutes processed monthly in real-time conversations. Customers using Krisp have seen +99% NPS from end-customers.
  • Eliminated accent bias: Krisp bridges clarity gaps across Africa’s diverse English accents and native languages.
  • Talent expansion + boosted retention: Krisp accent conversion expands access to CX jobs for agents who might otherwise be excluded and preserves agents’ authentic voices, building confidence and authenticity.
  • Cutting costs: Removes the need for expensive and limiting accent neutralization training.
  • Global competitiveness: Allows operators to hire broadly, without limitations due to accent, and compete more effectively with leading outsourcing hubs like India and the Philippines.

“By integrating Krisp’s AI platform, including Accent Conversion and noise cancellation, we’re amplifying the human touch at every interaction,” said Sudhir Agarwal, Founder & CEO of Everise. “Krisp’s technology has consistently outperformed in head-to-head evaluations across clarity, naturalness, and accent accuracy.”

Krisp’s mission is to enhance the productivity of every voice interaction, which includes eliminating bias and language barriers. By combining advanced voice AI with enterprise-scale reliability, Krisp enables global CX organizations to deliver consistent, high-quality interactions at every touchpoint. 

 

To learn more, visit https://krisp.ai/contact-center/accent-conversion/

 

Media Contact

Molly Leahy

krispPR@walkersands.com

Krisp Accent Conversion v3.7 Expands to Latin American Accent Pack
https://krisp.ai/blog/krisp-accent-conversion-v3-7-expands-to-the-latam-accent/
Mon, 06 Oct 2025 16:38:01 +0000

Following the successful rollout of Accent Conversion v3.7 for Indian and Filipino accent packs, we are excited to announce that the Latin American (LatAm) accent pack has now been upgraded to v3.7.

This update introduces enhancements to speaker similarity and improves overall voice stability. As a result, the converted speech sounds closer to the original voice, preserving its unique qualities while remaining clear, stable, and easy to understand.

Key Improvements in LatAm v3.7

  • Speaker Similarity: Noticeably stronger preservation of the original speaker’s voice. Objective evaluations showed a 10% improvement in similarity compared to v3.5.
  • Voice Stability: More consistent pitch and tone throughout speech, eliminating artificial fluctuations and producing a smoother, more natural output.
  • Naturalness: With enhanced similarity and stability, converted speech is perceived as more human-like and fluid. Crowdsourced model comparisons demonstrated a 9% increase in naturalness scores for v3.7.

Evaluation Results

Our evaluation combined both objective metrics and subjective, crowdsourced testing to ensure robust validation:

  • 37 real-world recordings were sampled for evaluation.
  • For the crowdsourced study, each recording received 40 independent votes, yielding a total of 1,480 votes and ensuring statistical confidence in the results.
  • The reported results represent aggregated averages across all recordings.

These findings consistently confirm noticeable quality improvements delivered by v3.7.

Metric | LatAm AC 3.5 | LatAm AC 3.7 | Comment
Speaker Similarity (0 to 1) | 0.7 | 0.77 (+10%) | Objective metric computing similarity between two voices. The higher, the better. 37 real-world audio recordings assessed by 30 participants
Crowdsourced Evaluation – “How natural does the voice sound?” (1 to 5) | 3.35 | 3.44 (+9%) | 37 real-world audio recordings assessed by 30 participants
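
The post does not say how the Speaker Similarity score is computed. As an illustration only, one common approach in speaker verification is to take the cosine similarity between fixed-length speaker embeddings of the original and converted audio. The sketch below assumes that approach; the embedding values are hypothetical placeholders, and a real system would obtain them from a speaker-encoder model.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Hypothetical speaker embeddings (assumption: not values from the post).
original_voice = [0.20, 0.70, 0.10, 0.60]
converted_voice = [0.25, 0.60, 0.20, 0.55]

score = cosine_similarity(original_voice, converted_voice)
print(f"speaker similarity: {score:.3f}")
```

With non-negative embeddings like these, the score falls between 0 and 1, and a higher value means the converted voice is closer to the original speaker.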

 

Comparative audio samples

Listening Tip: For the most accurate and immersive comparison between Accent Conversion v3.5 and v3.7, we recommend using quality headphones.

This highlights improvements in clarity, naturalness, and speaker identity preservation that may not be as noticeable on laptop or mobile speakers.

# Improvement Category Original Converted AC v3.5 Converted AC v3.7
1 Speaker Similarity, Voice Stability
2 Speaker Similarity, Speech Naturalness
3 Speaker Similarity
4 Speaker Similarity, Speech Naturalness
5 Speaker Similarity
6 Speaker Similarity

 

 

 

Krisp Announces Accent Conversion for Africa
https://krisp.ai/blog/krisp-announces-accent-conversion-for-south-africa/
Thu, 02 Oct 2025 13:02:10 +0000

South Africa is rapidly becoming one of the world’s fastest-growing hubs for customer experience delivery. With its highly-skilled, English-speaking workforce, strong cultural alignment with Western markets, and cost-effective operations, the country has become a go-to region for global enterprises and BPOs. The market is projected to continue expanding rapidly, with South Africa positioned as a global gateway for CX across Africa and Europe.

 

Krisp will make Accent Conversion for African English accents—including South African, Ugandan, Kenyan, and Nigerian—available in October 2025, following the release of its 3.7 model for Indian, Pakistani, Filipino, and Latin American support.  It will be fully tested and available for live deployment.

 

Proven performance

Krisp’s Voice AI platform is already trusted at scale:

 

  • 250,000+ enterprise seats deployed
  • 80B+ minutes processed monthly in real-time conversations

 

Krisp’s Accent Conversion delivers measurable impact:

 

  • +99% NPS from end-customers
  • +26.1% in sales conversions
  • +14.8% in revenue per booking

 

Accent clarity is often the deciding factor in customer satisfaction, repeat business, and agent productivity. Krisp Accent Conversion for South Africa ensures conversations remain natural, inclusive, and crystal clear—unlocking the country’s full potential as a CX powerhouse.

 

Accent Conversion takes these advantages further by:

  • Eliminating accent bias: Bridges clarity gaps across South Africa’s diverse English accents and native languages.
  • Unlocking opportunity: Expands access to CX jobs for agents who might otherwise be excluded.
  • Boosting retention: Preserves agents’ natural voices, building confidence and authenticity.
  • Cutting costs: Removes the need for expensive and limiting accent neutralization training.
  • Scaling without limits: Allows operators to hire broadly, without limitations due to accent, and compete globally with India and the Philippines.

 

Krisp continues to set the standard for real-time Voice AI at enterprise scale, giving global operators technology they can trust to deliver measurable results.

 

Krisp Accent Conversion for South Africa will be available for deployment in October. To learn more, visit https://krisp.ai/contact-center/accent-conversion/

Introducing Krisp Accent Conversion v3.7
https://krisp.ai/blog/introducing-krisp-accent-conversion-v3-7/
Thu, 07 Aug 2025 07:55:59 +0000

Krisp Accent Conversion v3, released in March 2025, marked a breakthrough moment in the evolution of our accent conversion technology. For the first time in two years, we felt the system was mature enough for wide-scale production use.

 

In May 2025, we released Accent Conversion v3.5, bringing a major quality upgrade, with ~20% improvement across key metrics for both Filipino and Indian accents (details here). Thanks to the Krisp desktop application’s auto-update mechanism, the rollout reached 95% of users within 2 days, and feedback from both agents and customers was overwhelmingly positive, driving improvements in sentiment and business KPIs.

 

In July 2025, we expanded the offering to support the Latin American accent pack. The launch quickly gained traction with several large customers and is now deployed across thousands of agents.

 

Throughout this period, we’ve worked closely with partners, agents, and customers to deeply understand corner cases, especially for the Indian accent, which is the most challenging due to its vast regional variation and phonetic complexity. This close collaboration, combined with relentless efforts from the world-class research and engineering teams at Krisp, has culminated in another major step forward.

 

Today, we’re launching Accent Conversion v3.7, delivering significant improvements in naturalness and voice stability. This release is currently focused on the Indian accent pack, with support for other accents rolling out soon.

The following sections summarize the key improvements, benchmarking methodology, and a side-by-side comparison of Accent Conversion v3.7 with v3.5.

Key Improvements in AC v3.7

  1. Naturalness: The converted speech sounds even more human-like and natural, with much improved filler-sound handling. Here, expert-rated naturalness scores improved by +14%. Crowdsourced evaluations confirm it with a +6% gain.
  2. Voice Stability: Enhanced consistency in pitch and tone throughout the utterance, helping avoid unnatural fluctuations, especially for thick accents. This contributed to improved naturalness and clarity scores across all metrics.
  3. Speech & Audio Clarity: Improvements were noted in both intelligibility and the reduction of artifacts and distortions. Speech Clarity scores rose by 5% in expert assessments, with corresponding enhancements across Meta metrics.
  4. Pronunciation Accuracy: Objective metrics improved as well, with about a 4% relative improvement in Phoneme Error Rate (PER), which can be attributed to the inclusion of more conversational data in training. Accent-specific enhancements in phoneme pronunciation, such as more native-like articulation of “R” and “L”, contribute to a +5% increase in the Accent Conversion score.

Evaluation Results

For subjective and objective evaluations, 78 real-world recordings were sampled.

For the crowdsourced evaluation, each recording received exactly 30 independent votes to ensure statistical confidence, for a total of 2,340 votes.

The results shown in the table below represent aggregated averages across all recordings.

Metric | IN AC v3.5 | IN AC v3.7 | Comment
Expert Evaluation – Natural speech (1 to 5) | 3.7 | 4.2 (+14%) | Speech sounds even more human-like, with much improved filler-sound handling
Expert Evaluation – Speech Clarity (1 to 5) | 4.0 | 4.2 (+5%) | Speech is clearer and has fewer artifacts, especially in slurred and mumbling segments
Expert Evaluation – Accent Conversion (1 to 5) | 4.3 | 4.5 (+5%) | Accent-specific enhancements in phoneme pronunciation, such as more native-like articulation of “R” and “L”
Crowdsourced Evaluation – “How natural does the voice sound?” (1 to 5) | 3.4 | 3.6 (+6%) | 78 real-world audio recordings assessed by 30 participants
Crowdsourced Models’ Comparison – “Which option sounds more natural?” | 1242 | 1878 (+20%) | 78 real-world audio recording pairs were evaluated, with each pair assessed by 40 participants
Meta Aesthetic – Natural speech (1 to 10) | 5.6 | 5.8 (+4%) |
Meta Aesthetic – Speech Clarity (1 to 10) | 7.5 | 7.6 (+1%) |
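
For the score-based rows, the percentages in parentheses are consistent with the relative change over the v3.5 score, rounded to the nearest whole percent. The following is a minimal sketch of that arithmetic using the values from the table; the function name and dictionary keys are illustrative, and the vote-count row is left out because it is not a score on a rating scale.

```python
def relative_change_pct(old: float, new: float) -> float:
    """Relative change from old to new, expressed in percent."""
    if old == 0:
        raise ValueError("old score must be non-zero")
    return (new - old) / old * 100.0


# (v3.5 score, v3.7 score) pairs copied from the results table.
scores = {
    "Expert: Natural speech": (3.7, 4.2),
    "Expert: Speech Clarity": (4.0, 4.2),
    "Expert: Accent Conversion": (4.3, 4.5),
    "Crowdsourced: naturalness": (3.4, 3.6),
    "Meta Aesthetic: Natural speech": (5.6, 5.8),
    "Meta Aesthetic: Speech Clarity": (7.5, 7.6),
}

rounded = {
    name: round(relative_change_pct(old, new))
    for name, (old, new) in scores.items()
}

for name, pct in rounded.items():
    print(f"{name}: +{pct}%")
```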

 

Comparative audio samples

Listening Tip: For the most accurate and immersive comparison between v3.5 and v3.7 Accent Conversion, we recommend using quality headphones.

This helps highlight the improvements in clarity, naturalness, and speaker identity preservation that may be less perceptible on laptop or mobile speakers.

# Improvement Category Original Converted AC v3.5 Converted AC v3.7
1 Speech Naturalness
2 Speech Naturalness
3 Speech Naturalness, Speech Clarity
4 Speech Clarity
5 Speech Clarity, Speech Naturalness, Voice Stability
6 Speech Clarity, Speech Naturalness, Voice Stability
7 Speech Naturalness, Speech Clarity
8 Speech Naturalness, Speech Clarity

 

Appendix

Subjective Evaluation

Our evaluation was conducted across two structured tracks: expert panel ratings and crowdsourced listener preferences, designed to capture both technical precision and human perception.

Real-world agent calls have been sampled to represent a diverse set of speakers and input conditions, including, but not limited to

  • Accent level – high, medium, low
  • Speech rates and fluency
  • Background conditions (quiet, noisy, multi-speaker environments)

Evaluators scored each recording across four qualitative dimensions using a 5-point Likert scale:

Score Meaning
5 Excellent / Native-like
4 Very Good
3 Acceptable
2 Needs Improvement
1 Poor / Unintelligible

1. Expert Panel Evaluation

Six expert evaluators independently rated matching audio pairs — each pair consisting of the same original voice converted by AC v3.5 and AC v3.7.

To eliminate bias:

  • File names were anonymized (no version markers)
  • The order of samples was randomized
  • Scoring was blind and individual (no group discussion)

2. Crowdsourced Evaluation

To further simulate real-world user perception, a blind A/B test was run with pairs of recordings: AC v3.5 vs. AC v3.7.
78 real-world audio recording pairs were evaluated, with each pair assessed by 40 participants, resulting in 3,120 votes overall.

Participants were asked the following question:
“Which option sounds more natural (i.e., more human-like)?”

Results:

  • Version 3.5 was selected 1242 times
  • Version 3.7 was selected 1878 times
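
The vote counts above can be turned into vote shares, as in the sketch below. The gap between the two shares is about 20 percentage points, which may be what the +20% figure in the results table refers to, though the post does not say so. The z-score is an addition for illustration: it uses a normal approximation to a sign test against a 50/50 split and assumes the votes are independent, which the post does not state (each participant may have rated many pairs).

```python
import math


def ab_summary(votes_a: int, votes_b: int) -> dict:
    """Summarize a two-option preference test from raw vote counts."""
    total = votes_a + votes_b
    if total == 0:
        raise ValueError("at least one vote is required")
    share_a = votes_a / total
    share_b = votes_b / total
    # Normal approximation to a two-sided sign test (null: 50/50 split).
    z = (votes_b - total / 2) / math.sqrt(total * 0.25)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return {
        "total": total,
        "share_a": share_a,
        "share_b": share_b,
        "gap_points": (share_b - share_a) * 100,
        "z": z,
        "p_two_sided": p_two_sided,
    }


summary = ab_summary(votes_a=1242, votes_b=1878)  # v3.5 vs. v3.7
print(f"v3.5 share: {summary['share_a']:.1%}")
print(f"v3.7 share: {summary['share_b']:.1%}")
print(f"gap: {summary['gap_points']:.1f} percentage points")
```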

Evaluation metrics

Accent Conversion performance was measured across four key dimensions. These were selected based on real-world call center priorities such as clarity, naturalness, and robustness.

Metric | Description
Accent Conversion | How effectively the speaker’s original accent is transformed into a neutral or target accent. High scores mean minimal accent leakage or trace of the original pronunciation.
Speech Clarity | Evaluates articulation, intelligibility, and absence of audio distortions, such as mumbling, muffling, or low vocal energy.
Natural Speech | Measures how closely the output resembles fluid, human-like speech, including natural variations in pitch, tone, rhythm, and intonation.
Pronunciation Accuracy | Measures how closely the converted speech matches standard American English pronunciation at the phoneme level. It evaluates whether individual sounds (vowels, consonants, syllables) are produced correctly and consistently, without distortion, misplacement, or omission, ensuring that the converted voice sounds intelligible and native-like to a U.S. listener.

Objective Evaluation

For objective evaluation, the same set of recordings was processed with Meta Audiobox Aesthetics, which produces metrics strongly correlated with Natural Speech and Speech Clarity. Additionally, to quantify how each system impacts phoneme accuracy, all recordings were processed with the Facebook NN Phonemizer; the resulting Phoneme Error Rate is strongly correlated with the Accent Conversion metric.

Objective Metric | Interpretation | Highly Correlated to Subjective Metric | What It Captures
Production Quality* | Higher is better | Speech Clarity | Fidelity, presence of audio artifacts, balance, and clarity of the output signal
Content Enjoyment* | Higher is better | Natural Speech | Perceived naturalness, fluidity, and enjoyment of listening, akin to human listening satisfaction
Phoneme Error Rate (PER) | Lower is better | Accent Conversion | Measures pronunciation distortion. Lower scores mean more accurate, intelligible speech with better articulation.

* These metrics are derived from waveform-level analysis and do not require transcript or linguistic alignment, making them ideal for evaluating accent conversion outputs that vary in delivery and prosody.
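
The post does not describe how its tooling calculates Phoneme Error Rate. As commonly defined, PER is the edit distance (substitutions, insertions, and deletions) between the recognized phoneme sequence and a reference sequence, divided by the reference length. The sketch below assumes that definition; the phoneme sequences are hypothetical examples, not data from the evaluation.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by reference length. Lower is better."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: "red light" with the "l" realized as "r".
reference = ["r", "eh", "d", "l", "ay", "t"]
hypothesis = ["r", "eh", "d", "r", "ay", "t"]

per = phoneme_error_rate(reference, hypothesis)
print(f"PER: {per:.1%}")  # one substitution over six reference phonemes
```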

Krisp Unveils Industry-First AI Accent Conversion for LATAM
https://krisp.ai/blog/krisp-unveils-industry-first-ai-accent-conversion-for-latam/
Thu, 05 Jun 2025 10:03:00 +0000

BERKELEY, CA, June 5, 2025 - Krisp, the leader in AI-powered voice technology, announced today the launch of AI Accent Conversion for Latin America, becoming the first AI-powered voice provider to offer accent conversion services to this region. Krisp supports five Latin American English accents that represent approximately 85% of Spanish speakers across the major dialect groups in Latin America, including Mexican and Central American, Caribbean Spanish and Andean Spanish/Neutral Standard Spanish.

 

AI Accent Conversion utilizes real-time inflection changes to help customers understand agents better by dynamically changing agents’ accents into the customer’s natively understood accent. By allowing agents to refine Latin American accents live on a call, Krisp helps agents and customers understand each other while keeping voices natural and authentic. As contact centers continue to use AI-powered tools to improve productivity, this update empowers call center agents and customers to have improved and clear conversations without altering their identity. 

 

“Krisp is proud to be the first to bring AI Accent Conversion to Latin America, an incredibly diverse region where authentic communication is key,” said Davit Baghdasaryan, CEO and co-founder of Krisp. “By supporting the most widely spoken Latin American English accents, we’re not just improving call clarity but helping to bridge cultural and linguistic gaps in real time. At Krisp, our goal has always been to enable human conversations that are clear, with nuance and without background noise.”

 

Other benefits of AI Accent Conversion include: 

  • Boosted agent productivity: Allows agents to focus on the content of the conversation and on resolving the customer’s inquiry rather than on overcoming the accent barrier, raising productivity.
  • Enhanced fairness in recruitment: Krisp removes the native English accent requirement, broadening the talent pool and promoting diversity in hiring.
  • Reduced bias: Mitigates customer bias against call center agents’ accents, boosting confidence and fostering a supportive work environment.
  • Improved agent interaction quality: Eliminates the need for accent faking, enhancing employee and agent satisfaction and well-being.

 

With the Latin American update, Krisp’s AI Accent Conversion tool now supports Spanish, Indian and Filipino dialects, with new dialects coming this year, such as South African and non-US English accents. The newest update also includes improved naturalness and voice preservation for Indian and Filipino accents, as well as reduced accent leakage and improved phoneme precision. No other provider is delivering accent localization in LATAM with real-time accent conversion, significantly improving agent and customer satisfaction by enabling clearer conversations.

 

To learn more, visit https://krisp.ai/

About Krisp

Founded in 2017, Krisp pioneered the world’s first AI-powered Voice Productivity software. Krisp’s Voice AI technology enhances digital voice communication through audio cleansing, noise cancellation, accent conversion, live speech-to-speech translation, and agent assist. Offering full privacy, Krisp works on-device, across all audio hardware configurations and applications that support digital voice communication. Today, Krisp is deployed on over 200 million devices, has transcribed over 40 million calls and processes over 80 billion minutes of voice conversations every month, helping businesses harness the power of voice to unlock higher productivity and deliver better business outcomes. 

Learn more at www.krisp.ai 

 

Media Contact

Molly Leahy

krispPR@walkersands.com

Introducing Accent Conversion v3.5
https://krisp.ai/blog/introducing-accent-conversion-v3-5/
Thu, 22 May 2025 14:24:59 +0000

Krisp Accent Conversion v3.5 represents a significant upgrade over the previous v3.0 release. Both Indian and Filipino accent models show consistent improvements across clarity, naturalness, and pronunciation accuracy, validated through expert evaluation, crowdsourced ratings, and objective metrics. Overall, the v3.5 models deliver clearer, more natural, and more intelligible speech while preserving speaker identity.

 

Key Improvements in AC v3.5

  1. Speech & Audio Clarity: Major improvements in intelligibility and reduction of audio artifacts and distortions. Speech Clarity scores increased by +18% (Indian) and +23% (Filipino) in expert evaluations, with consistent boosts across Meta metrics as well.
  2. Naturalness & Fluidity: Speech sounds more human and expressive, with better rhythm, pacing, and filler sound handling. Expert-rated Natural Speech scores improved by +18% (Indian) and +20% (Filipino). Crowdsourced evaluations confirm this with +10% (Indian) and +6% (Filipino) gains.
  3. Pronunciation Accuracy: Improved phoneme articulation and intelligibility reflected in a fairly significant 10% reduction in Phoneme Error Rate (PER) for the Indian accent pack.
  4. Voice Stability: Enhanced consistency in pitch and tone throughout the utterance, helping avoid unnatural fluctuations. This contributed to improved naturalness and clarity scores across all metrics.
  5. Speaker Identity Retention: v3.5 models better preserve the original speaker’s voice characteristics, resulting in more personalized and authentic-sounding output, evident in higher naturalness ratings across both subjective and objective evaluations.

 

Evaluation Results

For subjective and objective evaluations, 78 real-world recordings were sampled for the Indian accent pack and 57 for the Filipino accent pack.

 

For the crowdsourced evaluation, each recording received exactly 30 independent votes to ensure statistical confidence — 2340 total votes for Indian recordings and 1710 for Filipino recordings.

 

The results shown in the table below represent aggregated averages across all recordings.

 

Metric | IN AC v3 | IN AC v3.5 | PH AC v3 | PH AC v3.5
Expert evaluation – Natural speech (1 to 5) | 3.3 | 3.9 (+18%) | 3.4 | 4.1 (+20%)
Expert evaluation – Speech clarity (1 to 5) | 3.4 | 4.0 (+18%) | 3.4 | 4.2 (+23%)
Crowdsourced evaluation – “How natural does the voice sound?” (1 to 5) | 3.1 | 3.4 (+10%) | 3.3 | 3.5 (+6%)
Meta Aesthetic – Natural speech (1 to 10) | 5.4 | 5.6 (+4%) | 5.4 | 5.7 (+6%)
Meta Aesthetic – Speech clarity (1 to 10) | 7.1 | 7.5 (+6%) | 7.1 | 7.5 (+6%)
Phoneme Error Rate (PER) | 26.1% | 24% (−10%) | 28.4% | 28.4% (no change)

 

Comparative audio samples

Listening Tip: For the most accurate and immersive comparison between v3.0 and v3.5 Accent Conversion, we recommend using quality headphones.

This helps highlight the improvements in clarity, naturalness, and speaker identity preservation that may be less perceptible on laptop or mobile speakers.

Indian English accent pack

Improvement category Original speech Converted AC V3 Converted AC V3.5
Voice stability

Speech clarity

Speech naturalness

Voice stability

Speech clarity

Speech naturalness

Speech clarity

Speech naturalness

Speech clarity

Speech naturalness

Audio quality

Speech clarity

Speech naturalness

Filipino English accent pack

Improvement category Original speech Converted AC V3 Converted AC V3.5
Audio quality

Speaker identity
preservation

Speech naturalness

Audio quality
Speech clarity

Speech naturalness

Audio quality

Speaker identity
preservation

Audio quality

Speech clarity

Speech naturalness

 

Appendix

Subjective evaluation

 

Our evaluation was conducted across two structured tracks: expert panel ratings and crowdsourced listener preferences, designed to capture both technical precision and human perception.

 

Real-world agent calls have been sampled to represent a diverse set of speakers and input conditions, including, but not limited to

 

  • Accent level – high, medium, low
  • Speech rates and fluency
  • Background conditions (quiet, noisy, multi-speaker environments)

 

Evaluators scored each recording across four qualitative dimensions using a 5-point Likert scale:

Score Meaning
5 Excellent / Native-like
4 Very Good
3 Acceptable
2 Needs Improvement
1 Poor / Unintelligible

1. Expert Panel Evaluation

 

Six expert evaluators independently rated matching audio pairs — each pair consisting of the same original voice converted by AC v3 and AC v3.5.

 

To eliminate bias:

 

  • File names were anonymized (no version markers)
  • The order of samples was randomized
  • Scoring was blind and individual (no group discussion)

 

2. Crowdsourced Evaluation

To further simulate real-world user perception, a blind A/B/C test was run with a trio of recordings: original vs. AC v3 vs. AC v3.5.

 

Respondents were asked a single question, “How natural does the voice sound?”, and scored recordings using the same 5-point Likert scale.
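
The post reports results as aggregated averages across all recordings but does not say exactly how the votes were combined. One plausible reading, sketched below as an assumption, is a two-stage mean: average the Likert votes for each recording, then average those per-recording means. The recording IDs and votes are hypothetical.

```python
from statistics import mean
from typing import Mapping, Sequence


def mean_opinion_score(votes_by_recording: Mapping[str, Sequence[int]]) -> float:
    """Average per-recording mean ratings on a 1-to-5 Likert scale."""
    if not votes_by_recording:
        raise ValueError("at least one recording is required")
    per_recording = []
    for recording_id, votes in votes_by_recording.items():
        if not votes:
            raise ValueError(f"no votes for {recording_id}")
        if any(v < 1 or v > 5 for v in votes):
            raise ValueError(f"vote outside the 1-5 scale for {recording_id}")
        per_recording.append(mean(votes))
    return mean(per_recording)


# Hypothetical votes (assumption: not data from the evaluation).
votes = {
    "rec_01": [4, 3, 4],
    "rec_02": [3, 3, 5, 2],
}

mos = mean_opinion_score(votes)
print(f"mean opinion score: {mos:.2f}")
```

Averaging per recording first gives every recording equal weight even if vote counts differ. When each recording receives the same number of votes, as described in the evaluation, this matches the pooled mean of all votes.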

 

Evaluation metrics

Accent Conversion performance was measured across four key subjective and objective dimensions. These were selected based on real-world call center priorities such as clarity, naturalness, and robustness.

Metric | Description
Accent Conversion | How effectively the speaker’s original accent is transformed into a neutral or target accent. High scores mean minimal accent leakage or trace of the original pronunciation.
Speech Clarity | Evaluates articulation, intelligibility, and absence of audio distortions, such as mumbling, muffling, or low vocal energy.
Natural Speech | Measures how closely the output resembles fluid, human-like speech, including natural variations in pitch, tone, rhythm, and intonation.

 

Objective evaluation

For objective evaluation, the same set of recordings was processed with Meta Audiobox Aesthetics, which produces metrics strongly correlated with Natural Speech and Speech Clarity. Additionally, to quantify how each system impacts phoneme accuracy, all recordings were processed with the Facebook NN Phonemizer; the resulting Phoneme Error Rate is strongly correlated with the accent conversion metric.

 

Objective metric Interpretation Highly correlated to subjective metric What it captures
Production quality* Higher is better Speech clarity Fidelity, presence of audio artifacts, balance, and clarity of the output signal
Content enjoyment* Higher is better Natural speech Perceived naturalness, fluidity, and enjoyment of listening — akin to human listening satisfaction
Phoneme Error Rate (PER) Lower is better Accent conversion Measures pronunciation distortion. Lower scores mean more accurate, intelligible speech with better articulation.

 

* These metrics are derived from waveform-level analysis and do not require transcript or linguistic alignment, making them ideal for evaluating accent conversion outputs that vary in delivery and prosody.

Krisp vs Sanas: Accent Conversion Comparison https://krisp.ai/blog/krisp-vs-sanas-accent-conversion-comparison/ https://krisp.ai/blog/krisp-vs-sanas-accent-conversion-comparison/#respond Mon, 21 Apr 2025 12:33:17 +0000 https://krisp.ai/blog/?p=21433 What’s behind great Accent Conversion technology This document is intended for contact center operators to assess the quality and performance of Krisp AI Accent Conversion (also known as Accent Translation, Accent Neutralization, or Accent Smoothing) with Sanas’s offering.  Enhanced voice quality in agent-customer interactions, driven by accent conversion, generates ROI based on lower AHT, faster […]

The post Krisp vs Sanas: Accent Conversion Comparison appeared first on Krisp.

What’s behind great Accent Conversion technology

This document is intended to help contact center operators compare the quality and performance of Krisp AI Accent Conversion (also known as Accent Translation, Accent Neutralization, or Accent Smoothing) with Sanas’s offering.

Enhanced voice quality in agent-customer interactions, driven by accent conversion, generates ROI through lower average handle time (AHT), higher first-call resolution (FCR), and improved customer and employee satisfaction (CSAT and ESAT) scores.

Krisp and Sanas applications are deployed on the agents’ desktops and function as virtual microphones and speakers, working as companion applications with calling platforms. Delivering on the promise of smoothing difficult accents while maintaining clear voice quality in real-time is challenging and takes years to perfect.

There are a few technical challenges that make the task difficult:

  • Removing background noises and voices
  • Synthesizing natural-sounding speech
  • Ensuring accurate pronunciation of different words
  • Conveying emotions
  • Doing all of the above in various real-life situations (fast speech, acoustic conditions, different speakers, etc.)

Krisp launched its first commercial application in contact centers in 2019 and has processed over 1 trillion minutes of voice calls. Today, Krisp is deployed across many BPOs and top-tier enterprise call centers, and is also integrated into voice applications with more than 200 million users across desktop and mobile devices.

The table below highlights the key performance and management requirements for delivering tier-1 voice fidelity that scales globally within contact centers.

Krisp vs Sanas

Speech naturalness

Krisp
Sanas
Current deployments
Krisp:
  • Over 200 million desktop and mobile devices
  • Over 200K contact center agents
  • Over 1 trillion minutes of Krisp-processed voice
  • Embedded into world-class services such as Vonage, RingCentral, Zoho, Aircall, Discord, and others
Sanas:
  • Over 30K agents

Accent Conversion robustness

Supported accent packs
Krisp:
  • Indian English
  • Filipino English
Sanas:
  • Indian English
  • Filipino English

Audio latency
Krisp: 220 ms
Sanas: 350 ms to 450 ms

Modes of operation
Krisp:
  • Voice Preservation mode – fully preserves the user’s voice
  • Voice Profiles mode – allows the user to choose a natural-sounding output voice
Sanas:
  • Voice Preservation mode – somewhat preserves the user’s voice

Scalable range of output voices
Krisp: Yes. Can generate new voices in Voice Profiles mode.
Sanas: No. Limited to the user’s voice.

Accent leakage
Krisp:
  • Some leakage in Voice Preservation mode
  • No leakage in Voice Profiles mode
Sanas: Consistently observed leakage

Background noise and voice cancellation robustness
Krisp: Highly robust, automatically included in the Accent Conversion models
Sanas: Very limited

Agent and customer-side noise cancellation
Krisp: Bi-directional, automatically included in the Accent Conversion models
Sanas: Customer-side only

Headset robustness
Krisp: Highly robust
Sanas: Requires specific headsets

Robustness across users
Krisp: Works consistently across all users
Sanas: Requires testing three different versions for each user

Wrong pronunciations
Krisp: Some
Sanas: Noticeably more frequent

Preserves user’s voice
Krisp: Yes
Sanas: Limited

User enrollment needed
Krisp: No
Sanas: No

Dynamic adaptation to new speakers
Krisp: Yes, within the same or different call, regardless of the gender
Sanas: Unknown. Requires an output voice gender selection.

Voice quality
Krisp: 16 kHz (wide-band, VoIP, industry-leading voice quality)
Sanas: 8 kHz only

Noise Cancellation robustness

Voice quality and noise cancellation
Krisp: World’s best, based on objective and subjective tests (see and hear)
Sanas: New entrant, tests show noise leakage and voice quality degradation (see and hear)

Agent-side Background Voice Cancellation
Krisp: World’s best (see test measurements)
Sanas: Other voices and background chatter leakage when in a typical loud call center

Agent-side Noise Cancellation
Krisp: World’s best (see test measurements)
Sanas: Adequate performance for low-volume noises (fan, for example). Noise leakage and voice degradation in contact center environments (other voices, loud chatter).

Customer-side Noise Cancellation
Krisp: Included. Optimized for inbound voice from mobile or landline. Pass-through of ringtones, dialtones, etc.
Sanas: Not available

Acoustic Echo Cancellation
Krisp: Included. Optimized for call center use cases.
Sanas: Not available

Voice quality
Krisp:
  • 8 kHz (narrow-band, standard telephony, good voice quality)
  • 16 kHz (wide-band, VoIP, industry-leading voice quality)
  • 32 kHz (full-band, best voice quality – near studio-grade)
Sanas: 8 kHz only

Application and audio drivers robustness

CPU utilization
Krisp:
  • Supports range of CPUs typically in agent desktops
  • Supports older, lower-end CPUs through smaller models
  • Has auto-switching between models based on CPU load
Sanas:
  • Single model uses 2x more than Krisp on i5-8th Gen CPU
  • Error message in Sanas app with older CPUs
  • Slightly higher CPU utilization for CPUs beyond i5 12th gen

Audio drivers
Krisp: Highly reliable and tested for 7+ years
Sanas: Users often need to restart the drivers to avoid breakdown of mic and speaker audio streams.

Headset and application compatibility
Krisp: Compatible and tested with most headsets and voice applications used in call centers
Sanas: New entrant, minimal deployments and testing

Management and deployment at scale

Supported platforms
Krisp: Win, Mac, Linux, Chrome, VDI
Sanas: Win

Installation package
Krisp: Single installation package including all accent packs and noise cancellation
Sanas: A separate package for different accent packs. A separate package for noise cancellation.

SSO authentication
Krisp:
  • Available for agents, per the enterprise customers’ requirements
  • SSO/SCIM for automated provisioning and deprovisioning, saving admins’ time
Sanas:
  • Not available for users (agents)
  • Only available for admins

Remote deployment and settings for admins
Krisp: Highly Scalable
Sanas: Very Limited

App version management and auto-update
Krisp: Highly Scalable
Sanas: Very Limited

Analytics for Accent Conversion, Noise Cancellation and platform usage
Krisp: Available
Sanas: Not available

Enterprise-Grade Support
Krisp: 24/7. Application and IT infrastructure expertise during pilots and post-launch, including VDI.
Sanas: 24/7. Limited.

Krisp vs. Sanas: in-depth accent conversion evaluation

In this section, we present a comparative summary of evaluations conducted on Krisp and Sanas Accent Conversion technologies. These evaluations reflect real-world usage scenarios and benchmark data commonly produced by enterprise customers and BPOs assessing solution fit and performance. The report compiles insights from subjective and objective comparisons across key speech quality metrics, with the aim of informing product decisions and highlighting relative strengths and limitations.

The evaluation results summarized here focus specifically on the Filipino English accent configuration, a commonly requested use case in global support environments.

Subjective evaluation

The evaluation was conducted across two structured tracks: expert panel ratings and crowdsourced listener preferences, designed to capture both technical precision and human perception.

1. Expert panel evaluation

Six expert evaluators independently rated 70 matched audio pairs, each consisting of the same original voice converted by Krisp and by Sanas. The recordings represented a diverse set of male and female speakers and input conditions, including, but not limited to:

  • Accent level – high, medium, low
  • Speech rates and fluency
  • Background conditions (quiet, noisy, multi-speaker environments)

Each recording was processed with both Krisp’s and Sanas’s accent conversion software, using a combination of VB-Audio Virtual Cable and Audacity to generate a pair of matching recordings.

Evaluators scored each recording across four qualitative dimensions using a 5-point Likert scale:

 

Score Meaning
5 Excellent / Native-like
4 Very Good
3 Acceptable
2 Needs Improvement
1 Poor / Unintelligible

 

To eliminate bias:

  • File names were anonymized (no brand markers)
  • The order of samples was randomized
  • Scoring was blind and individual (no group discussion)
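
The blinding steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `blind_and_shuffle`, the labels `system_a` and `system_b`, and the file paths are hypothetical, and the post does not describe the tooling that was actually used.

```python
import random


def blind_and_shuffle(pairs, seed=0):
    """Anonymize and shuffle matched recordings.

    `pairs` is a list of dicts mapping a system label to a file path,
    e.g. {"system_a": "a/001.wav", "system_b": "b/001.wav"}.
    Returns (trials, key): trials hold only opaque IDs; key maps each ID
    back to (system, path) and must be kept away from evaluators.
    """
    rng = random.Random(seed)  # seeded so a session can be reproduced
    trials, key = [], {}
    for pair in pairs:
        items = list(pair.items())
        rng.shuffle(items)  # randomize which system is heard first
        trial = []
        for system, path in items:
            blind_id = f"{rng.getrandbits(32):08x}"
            while blind_id in key:  # extremely unlikely, but avoid collisions
                blind_id = f"{rng.getrandbits(32):08x}"
            key[blind_id] = (system, path)
            trial.append(blind_id)
        trials.append(trial)
    rng.shuffle(trials)  # randomize the order of the pairs as well
    return trials, key


# Hypothetical file layout for three matched pairs.
pairs = [{"system_a": f"a/{i:03d}.wav", "system_b": f"b/{i:03d}.wav"} for i in range(3)]
trials, key = blind_and_shuffle(pairs, seed=42)
```

Evaluators would see only the opaque IDs in `trials`; the `key` mapping is applied after scoring to attribute each rating to a system.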

2. Crowdsourced A/B testing

To further simulate real-world user perception, a blind A/B test was run with a subset of 57 real-world, anonymized audio pairs. Each pair of recordings was voted on exactly 60 times.

Each respondent was asked: “Which voice sounds more natural?”

In total, 3,420 responses were gathered (57 pairs × 60 votes each), a sample large enough to give statistically meaningful insight into the perceived naturalness of the two accent conversion solutions. Each participant evaluated randomly selected samples, with no access to brand or source information.

Evaluation metrics

Accent Conversion performance was measured across four key subjective and objective dimensions. These were selected based on real-world call center priorities such as clarity, naturalness, and robustness.

Metric Description
Accent Conversion How effectively the speaker’s original accent is transformed into a neutral or target accent. High scores mean minimal accent leakage or trace of the original pronunciation.
Speech Clarity Evaluates articulation, intelligibility, and absence of audio distortions, such as mumbling, muffling, or low vocal energy.
Natural Speech Measures how closely the output resembles fluid, human-like speech, including natural variations in pitch, tone, rhythm, and intonation.
Background Noise/Voice Robustness Assesses the system’s ability to isolate the speaker’s voice and maintain quality when external noises or secondary voices are present.

Objective evaluation methodology and metrics

To complement human evaluations, a structured objective analysis was conducted using state-of-the-art tools to quantify speech quality and pronunciation accuracy in Krisp and Sanas accent-converted outputs. These metrics offer an additional, unbiased perspective into the perceptual and technical performance of each solution.

For objective evaluation, the same 70 pairs of recordings were processed with Meta Audiobox Aesthetics, capturing metrics that correlate strongly with Natural Speech and Speech Clarity.

Accent conversion often alters pronunciation patterns. To quantify how each system affects phoneme accuracy, all recordings were also processed with the Facebook NN Phonemizer to derive a Phoneme Error Rate, which correlates strongly with the Accent Conversion metric.

Objective Metric Interpretation Highly Correlated to Subjective Metric What It Captures
Production Quality* Higher is better Speech Clarity Fidelity, presence of audio artifacts, balance, and clarity of the output signal
Content Enjoyment* Higher is better Natural Speech Perceived naturalness, fluidity, and enjoyment of listening — akin to human listening satisfaction
Phoneme Error Rate (PER) Lower is better Accent Conversion Measures pronunciation distortion. Lower scores mean more accurate, intelligible speech with better articulation.

* – These metrics are derived from waveform-level analysis and do not require transcript or linguistic alignment, making them ideal for evaluating accent conversion outputs that vary in delivery and prosody.
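
The post does not spell out how PER was computed. A common definition, assumed here, is the Levenshtein edit distance between a reference phoneme sequence and the phoneme sequence recognized from the audio, divided by the reference length. The phoneme sequences below are illustrative ARPAbet-style transcriptions of “travel” and “trouble”, not evaluation data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """Edit distance normalized by the number of reference phonemes."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Illustrative sequences: the intended word "travel" heard as "trouble".
reference = ["T", "R", "AE", "V", "AH", "L"]
hypothesis = ["T", "R", "AH", "B", "AH", "L"]
per = phoneme_error_rate(reference, hypothesis)  # 2 substitutions over 6 phonemes
print(f"PER = {per:.1%}")
```

Under this definition, a lower PER means the converted speech is recognized as closer to the intended pronunciation, which matches the "lower is better" interpretation in the table.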

Evaluation results

The following table summarizes the subjective and objective performance of Krisp vs. Sanas across key evaluation dimensions:

Metric Type Krisp Sanas Winner
Accent Conversion Subjective ✅ 3.6/5 ❌ 3.0/5 Krisp
Natural Speech Subjective 🟰 3.7/5 🟰 3.6/5 Near Tie
Speech Clarity Subjective ✅ 4.3/5 ❌ 3.7/5 Krisp
Background Noise/Voice Robustness Subjective ✅ 4.6/5 ❌ 3.9/5 Krisp
Crowdsourced preference: “Which recording sounds more natural?” (# votes / total responses) Subjective ✅ 1875/3420 ❌ 1545/3420 Krisp
Natural Speech* Objective ✅ 5.8/10 ❌ 4.7/10 Krisp
Speech Clarity* Objective ✅ 7.5/10 ❌ 5.3/10 Krisp
Phoneme Error Rate (PER) Objective ✅ 29.3% ❌ 40.7% Krisp
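
As a rough check on the crowdsourced vote split, the sketch below computes the preference share and a two-sided normal-approximation test against an even 50/50 split. It assumes every vote is independent, which is a simplification: each pair was voted on 60 times, so votes on the same pair are likely correlated and the true significance is probably weaker than this estimate.

```python
import math


def preference_test(wins, total):
    """Two-sided normal-approximation test of H0: both options are equally preferred."""
    share = wins / total
    z = (wins - total / 2) / math.sqrt(total * 0.25)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return share, z, p_value


# Vote counts taken from the results table above.
share, z, p_value = preference_test(1875, 3420)
print(f"share = {share:.1%}, z = {z:.2f}, p = {p_value:.2g}")
```

With these counts the preferred share is about 54.8%, and under the independence assumption the p-value is well below 0.001.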

Main Takeaways

  • Krisp leads on nearly every performance metric, both human-perceived and objectively measured, showing superior clarity, intelligibility, and accent transformation accuracy. The one exception is subjective Natural Speech, where the two solutions are nearly tied (3.7 vs. 3.6).
  • Accent Conversion: Krisp delivers more effective accent neutralization with fewer traces of the original pronunciation. Sanas often leaks source accent elements and produces less consistent results across varied speakers and speech patterns.
  • Speech Clarity & Phoneme Accuracy: Krisp-converted speech is significantly easier to understand. Sanas samples frequently exhibit muffled segments or slurred phonemes, which negatively affect comprehension and usability in customer support settings.
  • Background Noise Robustness: Krisp maintains speech quality in real-world noisy conditions, including multi-speaker and contact center environments. Sanas, by contrast, is more susceptible to background voice leakage — a potential liability for call quality and privacy.
  • Audio Quality and Bandwidth: Krisp outputs at 16 kHz wideband audio, providing richer, more intelligible voice quality, especially in modern platforms like Zoom, MS Teams, and G.722-based telephony. Sanas outputs audio at 8 kHz, which can degrade quality in high-bandwidth environments and limit downstream use in QA systems.
  • Compatibility and Headset Dependence: Sanas performance appears dependent on specific headsets to avoid secondary voice artifacts. Krisp, by contrast, is hardware-agnostic and built with a robust, production-grade Background Voice Cancellation AI model.
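
The bandwidth point follows from the Nyquist limit: a signal sampled at a given rate can represent frequencies only up to half that rate. A short sketch, using the sample rates mentioned in this comparison (the function name is illustrative):

```python
def max_audio_frequency_hz(sample_rate_hz):
    """Nyquist limit: the highest frequency a sampled signal can represent."""
    return sample_rate_hz / 2


for rate in (8_000, 16_000, 32_000):
    limit = max_audio_frequency_hz(rate)
    print(f"{rate} Hz sampling -> up to {limit:.0f} Hz of audio bandwidth")
```

So 8 kHz output tops out at 4 kHz of audio bandwidth, while 16 kHz output reaches 8 kHz, a range that carries a good share of the energy of consonants such as “s” and “f”.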

 

Comparative audio samples

# Observation Audio
1 – Strong accent leakage in Sanas
– Pronunciation error of “across”, “travel” in Sanas
– Krisp version is more natural and easier to understand
– Krisp fixed “trouble” in original speech to “travel”
Original

Sanas

Krisp
 2 – Strong accent leakage in Sanas
– Slurred and unintelligible speech in Sanas
– Krisp version is more natural and easier to understand
Original

Sanas

Krisp
3 – Robotic, slurred and unintelligible speech in Sanas
– Krisp version is more natural and easier to understand
Original

Sanas

Krisp
4 – Pronunciation errors of “interested”, “especially”, “gives” in Sanas speech
– Muffled “every day” in Sanas
– Strong accent leakage in Sanas
– Pronunciation error of “hobbies” in original, Krisp, Sanas versions
Original

Sanas

Krisp
5 – Muffled output on “a smooth” in Sanas
– Better naturalness, higher quality voice in Krisp
Original

Sanas

Krisp
6 – Strong accent leakage in Sanas
– Pronunciation errors of “range” in Sanas
Original

Sanas

Krisp
7 – Pronunciation errors of “permission”, “dialer” in Sanas
– Krisp version is more natural and easier to understand
Original

Sanas

Krisp
8 – Strong accent leakage in Sanas
– Krisp fixed “trouble” in original speech to “travel”, Sanas did not
Original

Sanas

Krisp
9 – Strong accent leakage in Sanas
– Muffled “financial” in Sanas
– Krisp version is more natural and easier to understand
Original

Sanas

Krisp
10 – Strong accent leakage in Sanas
– Pronunciation errors of “support”, “workforce” in Sanas
– Muffled “more”, “Colombia” in Sanas
Original

Sanas

Krisp
11 – Strong secondary voice leakage in Sanas
– All secondary voices cleaned in Krisp
Original

Sanas

Krisp
12 – Strong secondary voice leakage in Sanas
– All secondary voices cleaned in Krisp
Original

Sanas

Krisp
13 – Moderate accent leakage in Sanas
– Muffled words in Sanas
– Krisp audio is easier to understand
Original

Sanas

Krisp
14 – Muffled output on “book an appointment” in Sanas
– Excellent naturalness and smoothness in Krisp
– Pronunciation error in “checkup” in Original, Sanas, and Krisp
Original

Sanas

Krisp
15 – Pronunciation error in “favorite” in Sanas
– Pronunciation error in “for me” in Original, Sanas, and Krisp
Original

Sanas

Krisp
16 – Muffled “sign the order” in Sanas
– More stable volume in Krisp
Original

Sanas

Krisp
17 – Pronunciation errors of “professional”, “tech support” in Sanas
– Low volume, muffled “tech support” in Krisp
Original

Sanas

Krisp

 

Krisp is a trusted vendor on G2

With over 500 reviews on G2, Krisp consistently excels in enhancing customer interactions for service teams. G2, a trusted platform for software reviews and assessments, showcases Krisp’s exceptional 4.7 rating—earned through the trust and endorsement of hundreds of verified professionals across diverse industries.

Check Krisp’s page on G2 here.

 

Krisp Voice AI Platform for call centers

Krisp is the only real-time Voice AI platform that covers every stage of the agent experience—before, during, and after the call—within a single, lightweight application. It eliminates the need to juggle multiple tools and services by delivering core capabilities like Noise Cancellation, Accent Conversion, Live Interpretation, real-time agent assist, and post-call summaries in one seamless interface.

 

Agents with pronounced English accents can benefit from Accent Conversion, which enhances comprehension in calls without altering their original voice. The same agents can handle international calls using Live Interpreter, enabling real-time multilingual conversations across 80+ languages with one click, directly in the Krisp app. This flexibility removes hiring constraints, eliminates the need for standard language line services, and allows teams to scale globally without friction.

 

During the call, Krisp Agent Copilot provides real-time transcripts, key moment capture, and access to company and industry-specific knowledge via AI Chat, boosting confidence and precision. After the call, automatic summaries and reports help streamline follow-ups and coaching. All of this is centrally managed, with analytics and policy controls available in a unified Admin Portal.

The Krisp platform integrates with the agent’s desktop and works with all CCaaS and calling applications, delivering call quality that translates to better CSAT and related contact center KPIs.

 

Conclusion

While both Krisp and Sanas are innovators in the Accent Conversion space, Krisp stands out as the enterprise-ready solution trusted by global contact centers.

Krisp’s Accent Conversion consistently delivers clearer, more natural, and more intelligible speech, with significantly lower accent leakage and superior performance in noisy, real-world environments. Across both subjective human evaluations and objective acoustic metrics, Krisp leads on all critical dimensions—accent conversion, speech clarity, and background noise robustness.

Beyond voice quality, Krisp offers:

  • Superior deployment flexibility, working seamlessly across any headset, desktop, or CCaaS platform, with no hardware or system limitations.
  • Built-in voice and noise cancellation, tested over 1 trillion minutes, eliminating the need for additional tools or packages.
  • Enterprise-grade reliability with SSO, auto-updates, analytics, and remote configuration for admins—essential for scaled contact center rollouts.
  • A full Voice AI platform in one app: Live Interpreter, Accent Conversion, Agent Copilot, Auto QA, and real-time knowledge—all integrated with one-click simplicity.

Sanas requires device-specific setups, lacks compatibility in key environments, and struggles with accent consistency and audio fidelity, especially in enterprise use cases.

💡 If your goal is to scale globally, serve diverse customers, and ensure your agents are clearly understood — Krisp is the clear choice for Accent Conversion that works, performs, and scales.

Krisp Launches Patented AI Accent Conversion v3, Setting a New Standard in Accent Technology https://krisp.ai/blog/krisp-launches-patented-ai-accent-conversion-v3/ https://krisp.ai/blog/krisp-launches-patented-ai-accent-conversion-v3/#respond Tue, 01 Apr 2025 02:15:49 +0000 https://krisp.ai/blog/?p=21304 BERKELEY, CA, April 1, 2025—Krisp, the leader in AI-powered voice technology, today announced the launch of AI Accent Conversion v3, the latest version of its breakthrough real-time speech-to-speech technology that enhances accent clarity while preserving the speaker’s natural voice. Backed by recently awarded patents, the newest model removes communication barriers in contact centers, allowing agents […]

The post Krisp Launches Patented AI Accent Conversion v3, Setting a New Standard in Accent Technology appeared first on Krisp.

BERKELEY, CA, April 1, 2025—Krisp, the leader in AI-powered voice technology, today announced the launch of AI Accent Conversion v3, the latest version of its breakthrough real-time speech-to-speech technology that enhances accent clarity while preserving the speaker’s natural voice. Backed by recently awarded patents, the newest model removes communication barriers in contact centers, allowing agents to communicate clearly and naturally without altering their identity.

 

Krisp’s AI Accent Conversion v3 refines clarity while maintaining tone, personality, and intonation. This advancement helps global contact centers improve customer satisfaction, sales conversions, and agent confidence by reducing misunderstandings and bias in voice interactions.

 

When enabled, AI Accent Conversion modifies the speaker’s accent in real time with just 200ms of processing latency—an imperceptible delay in conversations.

Patented AI Accent Conversion Technology, Built for Real-World CX

Krisp’s newly awarded patents enable highly natural, real-time accent conversion while preserving voice identity. These breakthroughs power Krisp’s ability to deliver on-device, real-time accent clarity without requiring complex IT infrastructure.

 

Accent barriers have long created friction in global contact centers, leading to miscommunication, increased call handle times, and hiring limitations. Krisp’s patented technology solves these challenges by instantly improving speech clarity while preserving a speaker’s unique voice—making conversations more natural and bias-free.

 

“AI Accent Conversion is ushering a new era in voice AI,” said Davit Baghdasaryan, CEO of Krisp. “We’re raising the bar for how AI enhances communication, ensuring contact centers can hire agents based on talent, not language, and empowering global teams to connect with customers better than ever before.”

Proven Impact: arrivia Case Study

Leading travel company arrivia implemented AI Accent Conversion to improve customer-agent communication and saw:

  • 99% increase in NPS
  • 26.1% boost in sales conversions
  • 14.8% increase in revenue per booking
  • 4.4% reduction in call handle time

 

“Krisp creates wins for our customers, company, and team members,” said Travis Markel, COO of arrivia. “It enables higher-quality interactions, unlocks new opportunities for our agents, and drives real productivity gains. Arrivia benefits from delivering service that exceeds expectations while strengthening connections between our global team and the customers we serve. Our partnership with Krisp has truly elevated our ability to deliver exceptional value.”

 

AI Accent Conversion v3 is available now and works seamlessly with all UCaaS and CCaaS platforms. 

 

For more information, visit https://krisp.ai/accent-conversion/ 

Krisp AI Accent Conversion v3 https://krisp.ai/blog/krisp-ai-accent-conversion-v3/ https://krisp.ai/blog/krisp-ai-accent-conversion-v3/#respond Tue, 25 Mar 2025 13:53:26 +0000 https://krisp.ai/blog/?p=21179 Real-time, enrollment-free, and dramatically more natural.   We’re excited to announce the launch of v3 of Accent Conversion technology, a major technical advancement in Krisp’s voice AI stack. This version delivers significant improvements in voice naturalness, phoneme-level precision, and speaker adaptation—all while maintaining our commitment to security, scalability, and ease of use. A leap in […]

The post Krisp AI Accent Conversion v3 appeared first on Krisp.

Real-time, enrollment-free, and dramatically more natural.

 

We’re excited to announce the launch of v3 of Accent Conversion technology, a major technical advancement in Krisp’s voice AI stack. This version delivers significant improvements in voice naturalness, phoneme-level precision, and speaker adaptation—all while maintaining our commitment to security, scalability, and ease of use.

A leap in performance, now patented

With v3, Krisp’s Accent Conversion has reached a new level of technical sophistication—and is now officially patented. This milestone validates our unique approach to accent transformation: one that preserves speaker identity, operates in real time, and avoids the privacy pitfalls of voice cloning or enrollment-based models.

Accent barriers have long created friction in global contact centers, leading to miscommunication, increased call handle times, and hiring limitations. Krisp’s patented technology instantly solves these challenges, making conversations more natural and bias-free.

 

“With v3, we’ve dramatically improved one of the toughest challenges in voice AI: adjusting an accent while keeping the speaker’s voice and natural tone intact,” said Davit Baghdasaryan, CEO and Co-Founder at Krisp. “Now backed by patents, it brings us closer to voice technology that feels human and helps global support teams build stronger connections with customers.”

Hear the difference: v2 vs v3

To make the technical improvements tangible, we’ve created recordings using Indian and Filipino English speakers across the current and previous versions of Accent Conversion.

Agent Original speech AC V2 AC V3
Mandovi (IN accent)
Sherwin (IN accent)
Princess (PH accent)
Louis (PH accent)

The leap in quality is clear. Where v2 introduced early gains in clarity, v3 offers a step-change in naturalness—with smoother tone and intonation, fewer sound artifacts, and more accurate voice preservation.

Real-time voice adaptation—no setup required

One of the notable enhancements in v3 is its zero-shot deployment and usage. The technology doesn’t require:

 

  • Voice enrollment by the agents
  • User training
  • Configuration or tuning

 

Simply connect your headset, enable Accent Conversion, and start speaking. Voice adaptation is instantaneous and speaker-agnostic—meaning if another person picks up the headset, the system automatically recalibrates in real time, with no drop in quality or smoothness of the flow.

 

This level of usability sets a new standard for production-ready, plug-and-play technology.

Privacy by design: no voice data stored

Accent Conversion v3 is built with privacy and security at its core.

 

  • No voice embeddings are stored
  • No personal voice data is saved on-device or in the cloud
  • All processing happens on the spot, in real time

 

This design not only simplifies compliance with stringent security protocols but also eliminates security concerns related to voiceprints or long-term data storage.

Phoneme-level precision and naturalness

From a technical standpoint, v3 introduces marked improvements across core speech processing components, including:

 

  • Finer phoneme-to-phoneme mapping
  • Improved prosodic modeling
  • Substantially reduced roboticness and speech artifacts
  • Greater emotional and inflectional nuance retention

 

As a result, agents sound more human, more authentic, and much easier to understand—especially in high-stakes communication environments like customer support, where comprehension and clarity are non-negotiable.

 

Compared to v2, v3 demonstrates:

 

  • Significant gains in intelligibility
  • Sharper articulation of consonants and vowels
  • A more fluid and natural speech cadence

A global roadmap ahead

Accent Conversion v3 currently supports a growing list of accents, with new geographies on the horizon. Our roadmap includes:

 

  • Latin American English accent pack
  • South African English accent pack
  • Conversion to regional U.S. English accents

 

Our goal is to support truly global teams with technology that adapts to real-world linguistic diversity—while retaining technical performance and security across all environments.

 

Accent Conversion v3 is now available for integration into your Krisp-powered workflows.

Check out more here: https://krisp.ai/accent-conversion/

5 Ways Accent Neutralization Software Enhances Call Center Efficiency https://krisp.ai/blog/5-ways-accent-neutralization-software-enhances-call-center-efficiency/ https://krisp.ai/blog/5-ways-accent-neutralization-software-enhances-call-center-efficiency/#respond Thu, 12 Sep 2024 10:50:54 +0000 https://krisp.ai/blog/?p=15113 Have you considered how accents affect communication in call centers? Now that businesses serve customers from various cultural and linguistic backgrounds, clear communication is key to providing excellent service. However, accents can sometimes confuse, leading to slower interactions and frustration for both agents and clients. Accent neutralization software is an effective solution, enabling agents to […]

The post 5 Ways Accent Neutralization Software Enhances Call Center Efficiency appeared first on Krisp.

Have you considered how accents affect communication in call centers? Now that businesses serve customers from various cultural and linguistic backgrounds, clear communication is key to providing excellent service. However, accents can sometimes cause confusion, leading to slower interactions and frustration for both agents and clients. Accent neutralization software is an effective solution, enabling agents to communicate more clearly regardless of their accents.

This article will cover 5 ways this technology enhances call center efficiency, streamlines customer interactions, and improves overall productivity in your operations, helping you achieve smoother communication across the board.

Key Takeaways

 

  • Accent neutralization reduces misunderstandings and creates smoother interactions, leading to higher customer satisfaction and loyalty.
  • Agents can onboard faster with less emphasis on accent training, allowing call centers to save time and resources during the training process.
  • With clearer communication, agents resolve issues faster, leading to reduced call handling times, fewer escalations, and improved overall efficiency.
  • Accent neutralization software enables call centers to hire a diverse workforce and scale globally without communication barriers, improving operational flexibility and efficiency.
  • By reducing communication anxiety and improving performance, accent neutralization software boosts agent confidence, lowers stress, and increases job retention.
  • Implementing accent neutralization software leads to measurable improvements in productivity, training time, and customer satisfaction, ultimately contributing to a more cost-efficient and high-performing call center.

1. Improved Customer Experience 

Customer experience is a crucial factor in the success of any call center. Clear communication plays a significant role in ensuring smooth interactions between agents and customers, and this is where accent neutralization software makes a substantial difference.

 

By reducing the barriers caused by accents, call centers can provide a more seamless customer experience. Here’s how accent neutralization software improves customer interactions:

Reduces Misunderstandings
  • Accents can often lead to miscommunication, causing customers to ask agents to repeat themselves or clarify points.
  • Accent neutralization ensures that agents’ speech is more understandable, leading to fewer misunderstandings and quicker conversations.
Smoother Interaction Flow
  • With clearer communication, conversations become more fluid, with less need to repeat or clarify information.
  • Customers feel more at ease, leading to better rapport and overall satisfaction with the service.
Increases Customer Trust and Loyalty
  • Customers who can easily understand agents are more likely to trust the service and feel valued.
  • A smooth communication experience encourages repeat interactions, which strengthens customer loyalty over time.
Reduces Call Handling Time
  • By minimizing the need for clarification, accent neutralization software helps agents resolve customer queries faster.
  • Reduced call handling times result in improved overall efficiency for the call center, leading to cost savings.
Enhances Customer Satisfaction
  • Ultimately, when customers feel understood and don’t have to struggle to communicate, they leave interactions feeling more satisfied.
  • A positive customer experience translates into higher customer satisfaction scores, boosting the call center’s reputation.

 

Using accent neutralization software creates a win-win scenario: agents can communicate with more confidence, and customers receive a higher level of service. Call centers can deliver a consistently better experience by removing communication barriers, leading to increased customer retention and positive word-of-mouth. 

 

2. Reduced Training Time for New Agents

 

Training new agents in call centers often involves lengthy sessions focused on communication skills, including accent neutralization and language clarity. With accent neutralization software, much of this training time can be cut, allowing new agents to be ready for live interactions faster. Here’s how it helps cut down on training time:

 

Eliminates the Need for Extensive Accent Training
  • Traditionally, agents need intensive training to modify or neutralize their accents to better suit a diverse customer base.
  • Accent neutralization software automatically adjusts speech, reducing the time and effort spent on this aspect of training.
Accelerates Onboarding
  • New agents can start handling calls sooner since they don’t need to go through accent-specific language courses.
  • This allows businesses to operationalize new hires more quickly, boosting productivity from the start.
Simplifies Communication Training
  • The software helps agents focus on other essential skills, such as customer service, technical knowledge, and issue resolution, rather than spending weeks on voice and pronunciation training.
  • As a result, communication training becomes more streamlined and effective.
Reduces Stress for New Agents
  • Many new hires feel pressured to speak clearly and modify their accents, which can add stress during the onboarding process.
  • Accent neutralization software alleviates this pressure, allowing agents to communicate confidently from day one.
Improves Training Consistency
  • With software standardizing speech across all agents, companies can ensure consistent communication quality without depending on the individual agent’s accent or speech clarity.
  • This leads to a more uniform customer experience regardless of who is on the call.

 

3. Increased Productivity and Call Resolution Rates 

One of the biggest challenges in call centers is maintaining high productivity while resolving customer issues efficiently. Accent neutralization software plays a vital role in boosting both productivity and call resolution rates by enabling smoother and faster communication. Here’s how it contributes to improved performance:

 

Faster First-Call Resolutions (FCR)
  • Accent neutralization ensures that agents are understood clearly by customers from the outset, reducing the chances of misunderstandings.
  • This results in quicker problem identification and resolution, minimizing the need for follow-up calls.
  • Higher FCR rates directly contribute to better customer satisfaction and more efficient call center operations.
Reduced Average Handling Time (AHT)
  • Miscommunications caused by accents can lead to longer call durations as agents and customers spend extra time clarifying details.
  • Accent neutralization software cuts down on these back-and-forth exchanges, enabling agents to resolve issues faster.
  • Shorter call times allow agents to handle more calls throughout the day, improving overall productivity.
Fewer Call Escalations
  • When customers struggle to understand agents due to heavy accents, they may request to speak with a different representative or escalate the issue to a supervisor.
  • With accent neutralization software in place, customers can clearly understand the agent, leading to fewer escalations and quicker resolution at the agent level.
  • This reduces the workload on supervisors and leads to smoother call flow.
Enhanced Multitasking Capabilities
  • Clear communication allows agents to focus on other essential tasks during calls, such as accessing customer data or troubleshooting technical issues.
  • This increases the overall efficiency of the agent, allowing them to handle multiple responsibilities without the distraction of miscommunication due to accents.
  • With smoother interactions, agents can handle more complex queries with ease.
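The metrics named in this section can be tracked with simple arithmetic. Below is a minimal sketch in Python, assuming the commonly used definitions: AHT as talk time plus hold time plus after-call work, divided by the number of calls, and FCR as the share of calls resolved on first contact. The `Call` record, its field names, and the sample values are hypothetical and purely illustrative; they do not come from Krisp or any particular call center platform.

```python
from dataclasses import dataclass


@dataclass
class Call:
    # Hypothetical record layout; field names are illustrative only.
    talk_seconds: int
    hold_seconds: int
    wrap_up_seconds: int          # after-call work
    resolved_first_contact: bool
    escalated: bool


def average_handling_time(calls):
    """AHT in seconds: (talk + hold + after-call work) / number of calls."""
    if not calls:
        return 0.0
    total = sum(c.talk_seconds + c.hold_seconds + c.wrap_up_seconds for c in calls)
    return total / len(calls)


def first_call_resolution_rate(calls):
    """FCR: share of calls resolved on the first contact (0.0 to 1.0)."""
    if not calls:
        return 0.0
    return sum(c.resolved_first_contact for c in calls) / len(calls)


def escalation_rate(calls):
    """Share of calls that were escalated to a supervisor (0.0 to 1.0)."""
    if not calls:
        return 0.0
    return sum(c.escalated for c in calls) / len(calls)


# Made-up sample data, used only to show how the functions are called.
sample_calls = [
    Call(300, 30, 60, True, False),
    Call(420, 60, 120, False, True),
    Call(240, 0, 60, True, False),
    Call(360, 30, 90, True, False),
]

print(f"AHT: {average_handling_time(sample_calls):.1f} s")
print(f"FCR: {first_call_resolution_rate(sample_calls):.0%}")
print(f"Escalation rate: {escalation_rate(sample_calls):.0%}")
```

Running the same calculation on calls from before and after a rollout is one way to check whether handling time, first-call resolution, and escalations actually move in the direction described above.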

 

4. Boost in Global Scalability 

In an increasingly globalized world, call centers are often required to serve customers from different regions who speak with a wide variety of accents. Accent neutralization software helps call centers overcome these communication barriers, making global scalability more feasible and efficient.

Here’s how it helps call centers expand their reach:

Supports a Diverse Workforce
  • With accent neutralization, call centers can hire agents from various linguistic and regional backgrounds without worrying about how their accents may affect customer communication.
  • This expands the potential hiring pool, allowing companies to recruit talent from anywhere in the world.
Enhances Communication with International Customers
  • Accent neutralization ensures that customers from different parts of the world can easily understand agents, regardless of their accents.
  • This fosters clearer communication, enhancing the customer experience across global markets and ensuring consistency.
Simplifies Global Expansion
  • When call centers can maintain effective communication across regions, expanding into new markets becomes less challenging.
  • Accent neutralization software allows companies to scale operations without the fear of accent-related communication barriers affecting performance.
Reduces Regional Adaptation Challenges
  • Traditionally, adapting communication strategies for different regions has been costly and time-consuming.
  • Accent neutralization software reduces the need for region-specific communication training, simplifying operations and cutting the costs of regional adaptation.
Increases Flexibility in Call Routing
  • Call centers can route calls to agents in different regions without worrying about accent-based misunderstandings.
  • This flexibility enables companies to optimize resource allocation, ensuring that customer support is available across various time zones and regions.

 

For example, Krisp’s Accent Conversion solution is designed to help businesses communicate clearly with a diverse, international customer base.

 

By automatically adjusting accents in real time, Krisp ensures that agents can be easily understood, no matter where they are located or which customer they serve. This powerful technology makes it easier for businesses to deliver consistent service quality across different regions, enhancing global scalability and operational efficiency.

 

5. Enhanced Agent Confidence and Job Satisfaction 

 

The success of a call center relies not just on technology but also on the well-being and confidence of its agents. Accent neutralization software contributes to this by making agents easier to understand and easing the pressure of handling diverse customer interactions. Here’s how it enhances agent confidence and job satisfaction:

 

Reduces Communication Anxiety
  • Many agents feel self-conscious about their accents, worrying that customers may struggle to understand them.
  • Accent neutralization software helps agents be understood clearly without having to modify their natural accent, reducing anxiety and boosting confidence during calls.
Improves Job Performance
  • With clearer communication, agents experience fewer customer misunderstandings and complaints, leading to more positive interactions.
  • This success motivates agents, making them feel more capable and effective in their roles, directly impacting job satisfaction.
Encourages Professional Growth
  • As agents become more comfortable communicating with customers across different regions, they are more likely to embrace new challenges and responsibilities.
  • Enhanced communication skills foster a sense of professional development, helping agents feel more engaged and committed to their roles.
Lowers Stress Levels
  • When agents don’t have to worry about being misunderstood due to their accents, they can focus on resolving customer issues more efficiently.
  • This reduces stress levels, leading to a healthier work environment and reducing burnout rates in call centers.
Increases Job Retention
  • Confident and satisfied agents are more likely to stay with the company long term.
  • By providing tools like accent neutralization software, call centers can reduce turnover rates, saving on recruitment and training costs while maintaining a stable workforce.
Fosters Inclusivity and Diversity
  • With accent neutralization software, agents from various linguistic backgrounds feel more included and valued, knowing their accents won’t be a barrier to success.
  • This promotes a more inclusive work environment, where diversity is seen as an asset rather than a challenge.

 

Conclusion 

 

Accent neutralization software has become a crucial tool for modern call centers, helping improve efficiency and communication on multiple fronts. By reducing misunderstandings, enhancing customer interactions, and accelerating new agent training, this technology allows call centers to operate more smoothly and with greater productivity. 

 

Its ability to standardize communication across diverse teams ensures global scalability while empowering agents to work confidently, free from the fear of being misunderstood due to their accents. The result is a more inclusive, satisfied workforce and a better customer experience. 

 

Adopting accent neutralization software, like Krisp’s accent conversion solution, is a strategic investment for businesses seeking to optimize their operations and stay competitive in a globalized market. It streamlines processes and paves the way for more scalable and efficient customer service. 

FAQs

How does accent neutralization improve customer satisfaction?
It reduces the chances of miscommunication, which leads to smoother interactions, less frustration, and more satisfied customers.
Does accent neutralization software replace training for call center agents?
While it doesn’t entirely replace training, it significantly reduces the time and effort needed to train agents on accent reduction.
Can accent neutralization software help my call center expand internationally?
Yes, it ensures that your agents can communicate clearly with customers worldwide, enabling seamless global operations.
Is Krisp’s accent conversion software easy to integrate into existing call center systems?
Yes, Krisp’s software is designed to be user-friendly and integrates smoothly into most call center systems without disrupting operations.


]]>
https://krisp.ai/blog/5-ways-accent-neutralization-software-enhances-call-center-efficiency/feed/ 0