Equidox is the fastest PDF remediation software tool available. The Equidox development team is constantly working to improve its speed and accuracy and to approach a fully automated process. PDF remediation may never be fully automated. However, each step taken in that direction saves time, money, and resources in making PDF content accessible and usable. David Freelan is our artificial intelligence developer. He joined the team to further the automation of Equidox and is making great strides in improving the software.
Equidox’s AI Developer, David Freelan
Here’s what David has to say about Equidox’s artificial intelligence development:
“Currently, the existing, trademarked Equidox software does not use ‘visual information’ to understand content. One consequence of not looking at visual information is that scanned documents are one of the more difficult types of content for Equidox to detect.”
“Our machine learning technology aims to change this over time. Our goal is to continue to reduce the amount of time remediators spend tagging document elements.”
The First Step in Implementing Machine Learning
“Currently, Equidox software, using its existing technology, has good success auto-detecting non-scanned text. Other visually structured and complex content, such as lists and tables, has been impossible to detect automatically. The existing Equidox software has some of the fastest and easiest-to-use functionality for tagging these elements. Nevertheless, we felt that the implementation of machine learning could improve on these processes. Because tables are some of the most time-consuming elements for remediators, they were the initial focus for our machine learning technology.”
Machine Learning for the Table Editor
“We have successfully implemented a new and improved machine learning detection process for the table editor.”
“The new detector uses ten new settings: five for columns and five for rows. Each uses a different detection method to place row and column delimiters in order to tag the individual cells of the table. The settings work independently, so they are useful for a wide variety of table formats. Some detect existing lines, some detect text, and some detect spacing. By choosing one setting from the options for rows and one setting from the options for columns, the machine learning detection process can more accurately pinpoint how to tag the content into cells.”
“When we compare the performance of our new table editor with other editors, not only are its results better than those of our existing table editor, but it also outperforms Amazon’s table extractor, Textract. Some of our remediators have commented that the new table editor features have reduced the time spent on page-sized tables from 15-20 minutes using the original Equidox table editor to about two minutes with our new machine learning technology.”
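To make the idea of row and column delimiters more concrete, here is a minimal Python sketch of one spacing-based approach. It is not Equidox's implementation and it uses no machine learning. The function names (gap_delimiters, detect_grid, cell_index), the bounding-box format, and the gap thresholds are all assumptions chosen for illustration.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical input format: one (x0, y0, x1, y1) box per word on the page.
Box = Tuple[float, float, float, float]


def gap_delimiters(intervals: List[Tuple[float, float]], min_gap: float) -> List[float]:
    """Return the midpoint of every empty gap of at least min_gap between intervals."""
    delimiters: List[float] = []
    current_end = None
    for start, end in sorted(intervals):
        if current_end is not None and start - current_end >= min_gap:
            # Nothing covers the space between current_end and start.
            delimiters.append((current_end + start) / 2)
        current_end = end if current_end is None else max(current_end, end)
    return delimiters


def detect_grid(boxes: List[Box], min_col_gap: float = 10.0,
                min_row_gap: float = 4.0) -> Tuple[List[float], List[float]]:
    """Place column delimiters from horizontal gaps and row delimiters from vertical gaps."""
    cols = gap_delimiters([(x0, x1) for x0, _, x1, _ in boxes], min_col_gap)
    rows = gap_delimiters([(y0, y1) for _, y0, _, y1 in boxes], min_row_gap)
    return cols, rows


def cell_index(box: Box, cols: List[float], rows: List[float]) -> Tuple[int, int]:
    """Return the (row, column) of the cell that contains the center of the box."""
    x0, y0, x1, y1 = box
    center_x, center_y = (x0 + x1) / 2, (y0 + y1) / 2
    return bisect_right(rows, center_y), bisect_right(cols, center_x)


# A made-up table with two rows and three columns of words.
words: List[Box] = [
    (10, 10, 50, 20), (100, 10, 140, 20), (200, 10, 240, 20),
    (10, 30, 60, 40), (100, 30, 150, 40), (200, 30, 230, 40),
]
cols, rows = detect_grid(words)
print(cols, rows)                        # column and row delimiter positions
print(cell_index(words[4], cols, rows))  # cell of the middle word in the second row
```

A heuristic like this works only when cells are separated by clear whitespace. Tables with ruled lines or tightly packed text would need a different strategy, which is presumably why a detector offers several independent settings for rows and columns.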
Planning for the Future of Equidox and AI
“The next item on the plate is to detect list items automatically, including lists that have a hierarchical structure. The idea is that you will be able to simply highlight the list, and the machine learning will give its best guess on where the list items, sublists, and delimiters are.”
“Once these more time-consuming tasks have been addressed and resolved, we will continue our improvements to the detection of other elements. This will include elements such as headers and links, and finally text, including scanned text.”
“We currently have the ability to detect specific headings and text in repetitive, large volume documents as part of our batch processing technology. However, this technology is batch-specific and must be tailored for each type of document. It is effective for documents with predictable, repetitive elements. These might include bank statements, insurance claims summaries, utility statements, billing and customer data reports.”
“As our machine learning technology improves, we hope it will also be able to detect scanned text (our existing technology already does a great job of auto-tagging digitally rendered text content) and take that final step toward using artificial intelligence for near-total automation.”
To learn more about Equidox’s automation and use of artificial intelligence and machine learning, contact us for a free demonstration.
Tammy Albee | Content Marketer | Onix
Tammy joined Onix after four years' experience working at the National Federation of the Blind. She firmly maintains that accessibility is about reaching everyone, regardless of ability, and boosting your market share in the process. "Nobody should be barred from accessing information. It's what drives our modern society."