Weighing your options when considering automatic speech recognition software
by Peter Lasensky
January 17, 2017

Anyone in the construction, engineering or architecture industries understands the importance of maintaining top-notch documentation throughout every step of a construction project. These industries are rife with potential for litigation, and airtight records are often your first—and sometimes only—line of defense against an expensive, potentially career-ending lawsuit.

With that in mind, many teams depend on dictation and transcription software to capture notes the moment an incident occurs, ensuring that their records are up to date. These detailed records are preferable to the old method of scribbling illegible notes on whatever is available, or trying to accurately recall details of the incident at the end of a long workday. Still, dictation software has flaws worth noting.

Avoiding Error

Automatic speech recognition (ASR) software has certainly come a long way, but it still has significant hurdles to overcome before it can capture speech as authentically and accurately as human transcription services do.

Within the last decade, ASR software’s error rate has declined from 80 percent to around 10 percent, according to a senior scientist at Microsoft. The average commercially available transcription software usually clocks in around 12 percent. The advances have been significant, and software providers are continuing to make progress, though it is estimated that it will be decades before the machines catch up to humans. The error rate for human transcription is around 4 percent, which puts the error rate of ASR software at two and a half to three times that of a human.
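To put those percentages in concrete terms, here is a minimal Python sketch. It assumes the figures quoted above are word error rates (the usual ASR metric, though the article's sources do not say so explicitly), calculated as the number of substituted, dropped and inserted words divided by the number of words actually spoken. The function names, the sample sentences and the 500-word report length are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word dropped by the transcript
                curr[j - 1] + 1,     # extra word inserted by the transcript
                prev[j - 1] + cost,  # word matched or substituted
            ))
        prev = curr
    return prev[-1] / len(ref)


def expected_errors(word_count: int, error_rate: float) -> float:
    """Rough number of wrong words to expect in a report of a given length."""
    return word_count * error_rate


if __name__ == "__main__":
    # Hypothetical dictated note and a transcript that mishears one number.
    spoken = "pour the footing at nine feet six inches"
    transcribed = "pour the footing at five feet six inches"
    print(f"Sample error rate: {word_error_rate(spoken, transcribed):.1%}")

    # Hypothetical 500-word daily report at the rates quoted in the article.
    for label, rate in [("ASR software", 0.12), ("human transcriber", 0.04)]:
        print(f"{label}: about {expected_errors(500, rate):.0f} errors")
```

In the sample, one misheard word out of eight gives an error rate of 12.5 percent, and that single word happens to be a measurement. At the quoted rates, a 500-word report would contain roughly 60 errors from software versus roughly 20 from a human transcriber.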

So, what’s the bottom line? People don’t naturally speak in text. They have accents, they use slang and they convey meaning with sound. All of these can be difficult, if not impossible, for transcription software to interpret. Factor in the noise on the average construction jobsite, and you have a recipe for a transcript riddled with inaccuracies.

Humans can adapt to different manners of speech in a way that is still out of reach for even the most sophisticated technology. A human listener can also naturally filter out background noise to get to the gist of what a speaker is trying to convey.

Trying to force your jobsite team to speak in a way that is easier for the software to understand is just going to discourage them from using the software at all. Without their notes, you could be exposed to litigation with no proper reporting to back you up.

Aiming for Accuracy

In most industries, accuracy is key. Mistakes can be costly and damaging to your hard-won credibility. Depending on ASR software is a recipe for disaster. If the software misunderstands an important measurement or other detail, it could set your work back, costing you time and money.

Additionally, because of the fast pace of the construction jobsite and the busy work schedules of most project managers, superintendents and other supervisors, many employees on the jobsite don’t have the time to spend checking their notes for mistakes after using ASR software. Consider the following issues related to ASR software:

  • Editing: All too often, editing goes hand in hand with ASR software. You will need to allow ample time to look over the notes, ensuring that nothing important is missing and that the report flows well. Most people who use ASR do so in a limited way, keeping notes to a minimum number of words, more like tweets. People don’t want to waste time editing notes, so they limit both the number of notes they take and the amount of content. This is exactly what you don’t want when you need notes that are accurate and that tell a story that can easily be understood. It is also helpful to have the audio file available for review, which is another authentication factor when you need to resolve disputes. If you don’t have time to spend poring over completed reports, searching for potentially costly errors, and you don’t have the best tool for risk mitigation, perhaps you should simply stick with the human touch.
  • Lack of intuition: Talk-to-text software doesn’t have the intuition that humans possess. We naturally fill in the blanks when someone is speaking, and we are capable of completely disregarding the “filler language” that so many people use in everyday conversation. ASR software will capture every “um” and “uh” you utter, while human transcriptionists will remove these from the text, ensuring that you are getting the meat of the topic, without any superfluous filler. When using a human transcription service, you speak naturally, instead of saying “comma” or “period” after each sentence. Human transcription also allows for directives, such as asking the transcriber to tag a note to a particular category, such as “change orders.”
  • Multiuser issues: While some transcription software can be trained to recognize your voice and speech patterns, what happens when someone else takes up the daily dictation duties? It’s an all-new learning curve for the software. Do you have time to wait for your ASR software to catch up?
  • Security: When you record your reports using ASR software, the audio file is gone as soon as the text is transcribed. Voice transcription services frequently save the recordings, so if there is an issue, you have the backup you may need down the road.

Your daily jobsite reporting is too important to leave in the hands of error-ridden software. A human transcriber, particularly one well-versed in construction terminology, will help your team create vital, real-time reports that can be completed anytime throughout the workday, saving your company both time and money in the long run.