Automating PDF Accessibility at Scale with Equidox AI

Equidox AI automates PDF accessibility and simplifies the process of digital compliance to a few clicks. Learn how this solution can result in substantial reductions in time and costs.

Video transcript

[Paul Campbell] Good afternoon, everyone. Welcome to our webinar for Equidox AI, which is a fully automated PDF remediation solution. We are very excited about this cutting-edge technology that is solving PDF remediation challenges at scale. By way of introduction, my name is Paul Campbell and I will be joined today by Dan Tuleta for the next 30 minutes.

Quick logistics: if you have questions during the webinar, please drop them in the Q&A chat button at the bottom of the screen and we'll get back to you with an answer. Additionally, this webinar will be recorded, and the recording will be sent after the meeting along with the slide deck and a short survey, which we'd appreciate you filling out. We're also happy to do a more direct, interactive session with you and other team members who were not available to join, and further our conversations as to how Equidox AI may be a fit for your organization specifically.

Overview of the agenda today: first we're going to talk about who Equidox Software Company is, where we have been and where we are going, challenges we have seen in the market, and our solution to the problem. Then Dan is going to talk about why we make PDFs accessible, what's driving this, and why people should make PDFs accessible, followed by an overview of Equidox AI and how it works, and finally a demonstration of our new bleeding-edge, powerful technology.

So a little bit about Equidox: we've been in existence for nearly a decade. Way back when, a Canadian citizen was trying to apply for a government job posting on the internet, but she was unfortunately unable to do so because of her visual disability. She sued the government and won her case. The government sought out a solution at that point but couldn't find one, so they asked the marketplace to respond, and that's why we as Equidox started to build a solution. And we really haven't stopped innovating since.
We created a robust software-as-a-service solution, and hundreds of customers currently use our SaaS product. The solution is world class and is adding tremendous value to the marketplace of digital accessibility for enterprise organizations, government organizations, and educational institutions. Our customers love the product, evidenced by the fact that nearly 100% of our customers renew their subscription every year.

That is all well and good. However, in the last couple of years we started hearing from organizations that had tens of thousands, hundreds of thousands, or even millions of documents that needed to be remediated, and there simply wasn't an automated solution for that daunting need. The traditional service providers were sending the documents to India and other countries, but there was no efficient way to scale, so companies, unfortunately, were forced to settle for a solution that is time consuming and expensive and, from what we found, doesn't truly mitigate the risk of a lawsuit.

So enter Equidox AI, and why we wanted to solve this problem in the market. Equidox AI is a fully automated PDF remediation solution that removes the traditional manual remediation methods and auto-tagging methods while increasing quality, accuracy, and compliance. Equidox AI is used for use cases where there are templated, recurring, large volumes of documents and manual remediation methods are just too cumbersome and daunting.

There are three main challenges when it comes to PDF remediation that we have found. Number one: cost. Companies have multiple vendors. Outsourced providers and the internal investment in personnel are very costly, and this can really be a runaway train of cost, because the industry-standard price per page adds up quickly at these volumes and manual work at this capacity and scale is extremely expensive. Quality, number two.
Because of the cumbersome manual processes discussed, auto-tagging mishaps and other issues, multiplied by the volume and complexity of the pages in scope, leave organizations exposed to non-compliance and potential lawsuits. And the last one is speed. Because of the demanding legal requirements and the quick turnaround times needed to get accessible information to customers or employees consistently, it's not realistic to keep up using traditional manual processes and traditional auto-tagging methods, given the sheer volumes to manage while also maintaining quality.

So we obviously wanted to create a better way to solve for these challenges, and we have. Our experts have found a way to truly automate the PDF accessibility process for many use cases involving high volumes of recurring documentation. Equidox AI automation allows, number one, for good quality, usability, and compliance every time because of our unique model creation using machine learning and artificial intelligence. We don't auto-tag, cut corners, or rely on any human element to get the process done, and the output not only will pass a checker but will be fully usable and accessible to anyone with a screen reader, which is very near and dear to our hearts as a remediation organization.

Equidox AI automation also accommodates aggressive timelines. Because we're relying on the technology, we can dictate how fast the solution runs and turn the dial up and down, so to speak, to accommodate whatever timelines may be required, because we have flexibility in the infrastructure and the compute that we apply to satisfy those timelines. And then lastly, Equidox AI automation allows for lower costs, process improvements, and vendor consolidation, where you don't have to rely on multiple vendors doing a common task.
You can have one organization produce this automated solution for your high-volume, templated documents. So with that said, now I'm going to turn it over to my colleague, Dan, to talk about why it's important to make PDFs accessible, how Equidox AI really works, and, finally, a demonstration. Dan?

[Dan Tuleta] Great, thank you, Paul. Hi, everyone. So yeah, let's talk a little bit about why we are making PDFs accessible. I assume that most people on this call are at least somewhat familiar with accessibility laws like the Americans with Disabilities Act or Section 508 of the Rehabilitation Act. I am not a lawyer, so I'm not going to go into the details of all of these laws, but at a high level, there are requirements for organizations to provide physical access like wheelchair ramps, elevators, and Braille signage, and organizations need to ensure that their public-facing digital content, including PDFs, is accessible to everyone, just as they must make sure that there's physical access to their buildings and facilities. So ignoring the accessibility of your digital content opens up your organization to legal risks. There have been thousands of organizations who learned this the hard way when they were sued for exactly this type of problem, and there are thousands more who quietly pay large settlements and then still ultimately have to go back and fix their accessibility deficiencies. So long story short, we live in a very digital world and we rely so heavily on digital information. Digital accessibility is not a fad and it's not going away, so it's always good to be aware of it and address it in a proactive way.

So for anyone who is unsure of why any of this digital accessibility stuff matters in the real world, people with disabilities use various types of assistive technologies to interact with digital content like PDFs.
A very common type of assistive technology is called a screen reader, which is capable of reading digital content like websites, applications, and documents. Screen readers use digital tags to navigate documents, and these tags need to be properly encoded into the document to organize the content and make it compatible with the screen reader. So think of the tags as a framework of the document which gives the screen reader the ability to navigate and interact with all of the various elements in the PDF. Equidox, in cooperation with the National Federation of the Blind, surveyed about 250 blind or low vision individuals who rely on screen readers to interact with their PDFs on a daily basis. Based on this survey, we found that at least two-thirds of PDF documents are inaccessible to people with disabilities. So if you put yourself in the shoes of a blind person, you can quickly imagine how frustrated you would be if you could not read two-thirds of the documents that you came into contact with on a daily basis. On top of that, imagine the potential privacy issues that there would be if you had to ask your neighbor or your friend to help you read private documents like banking or investment statements, an invoice, a pay stub, or insurance policy documents.

So just to further emphasize the points that I was making a couple of slides back, here is just some additional information about the volume and types of lawsuits that organizations have faced and will continue to face moving forward, and just to reiterate the digital accessibility requirements that organizations must adhere to. They are not going away, and there will continue to be an increase in the attention that is paid to them by state and federal mandates, the Department of Justice, disability advocacy groups, and individuals who simply want to be able to access their critical information.
So one of the main challenges around PDF accessibility is that every PDF document is unique. We have heard a lot of empty promises over the years of fully automating PDF accessibility, but there are so many things about PDFs that require human interpretation to decide how to tag specific elements within the content. I have been working in the PDF accessibility market for over seven years, and I have seen a lot of organizations assume that they have accessible documents because their documents have some tags in them. But they quickly learn that the documents are not usable, nor are they compliant, and they are still open to litigation. So I always tell people to beware of quote unquote auto-tagging technology masked as a solution to fully automate PDF accessibility. These auto-taggers that are floating around out there are capable of putting tags on the page, but there will always be accuracy issues, and the inaccuracy of these tags will lead to a lot of confusion and frustration for the screen reader user. Additionally, auto-taggers can and will leave organizations open to further litigation because there is no guarantee of compliance with WCAG standards. So even paying to outsource your huge batches of documents to auto-taggers, you're still not mitigating your risk of litigation, because auto-tagging falls well short of true compliance with accessibility standards. And then, of course, the alternative of outsourcing the remediation work to third parties, who are almost exclusively located overseas, introduces a mountain of data privacy issues, and even if you can work around that with your use case, the sheer volume is impossible to keep up with.
These outsourced remediation providers will cut corners to do the bare minimum of work that they need to make a document pass an automated checker, but they're not actually making the document compliant, because it simply takes too long to meet the deadlines at that type of volume.

So incorporating artificial intelligence, more specifically computer vision and machine learning, into high-volume PDF remediation allows our accessibility experts to train AI models to accurately identify and tag all of the elements in the document template. The use of AI developed by our data scientists, paired with the human element of our trained accessibility experts, allows for incredibly accurate, usable, and compliant PDFs to be returned to the customer in a fraction of the time, because AI works exponentially faster than humans manually tagging each page. AI doesn't need to take vacations. AI can work 24/7/365 without breaks, and AI doesn't need to cut corners to meet a deadline. It can just do it the right way.

So how does AI work? Our accessibility experts use example documents of customer templates to properly identify the various elements on the page. These elements might include text and paragraph structure, various levels of headings, lists and tables, graphs and images, and of course, the very important reading order of the content. This training data that we accumulate is then fed to the AI models so they can apply what they have learned en masse to many thousands or even millions of pages that have similar templates and formatting.

The mechanics of how AI technology works are rather abstract and more complex than what I'm capable of showing you here in a simple PowerPoint slide, but here are a few examples of how we can visualize the AI at work. So for example, in this scatter plot each of the green dots represents a page in a PDF.
They are grouped together by the AI based on similarities that the computer vision finds. So this cluster will contain all of the pages that contain pie charts, for example. In this example, you can see there are different multicolumn text layouts that the AI will use to recognize different pages and group them together appropriately so that it can apply the correct tag structure. The AI will pick up on font styles, sizes, and colors to help it establish the tags on the page. We can even train AI models to identify the many potential variations in tables, such as the numbers of columns and rows, table headers versus table data, and even tables of different sizes that might span across multiple pages. We'll talk a little bit about this when we get into the demo as well.

The result of this extensive document analysis and feeding the training data to the AI is that we're creating fully compliant PDF documents without any human remediators, who are, again, expensive to employ or outsource to, and who are, of course, liable to make human errors or be forced to cut corners just to meet unattainable, unrealistic deadlines due to the crazy volume demands. We are also reaching full compliance because this is not auto-tagging, just sloppily throwing tags on a page and saying that it's good enough. Beyond compliance and passing automated checkers, the bonus of using AI for high-volume and hyper-fast remediation is that it will produce incredibly accurate and very much usable documents for people with disabilities. So your customers who rely on assistive technology are not going to be filing complaints or lawsuits or calling your headquarters to complain that the document you've given them cannot be navigated or understood because they're using a screen reader.

So we are just about ready to jump into a demo. I promise the slides will end soon.
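The page grouping described with the scatter plot can be illustrated with a small Python sketch. This is only a toy stand-in: the talk does not say which features or clustering method Equidox AI uses, so the three per-page features (column count, image area fraction, table area fraction), the `group_pages` function, and the distance threshold are all hypothetical.

```python
from math import dist


def group_pages(features, threshold=0.5):
    """Greedily group pages whose feature vectors lie within `threshold`
    (Euclidean distance) of the first page placed in an existing group."""
    groups = []  # each group is a list of page indices
    for page_index, vector in enumerate(features):
        for group in groups:
            if dist(vector, features[group[0]]) <= threshold:
                group.append(page_index)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([page_index])
    return groups


# Hypothetical per-page features:
# (text columns, image area fraction, table area fraction)
pages = [
    (1.0, 0.60, 0.0),  # page 0: single column with a large chart
    (3.0, 0.00, 0.0),  # page 1: three-column text
    (1.0, 0.55, 0.0),  # page 2: single column with a large chart
    (3.0, 0.05, 0.0),  # page 3: three-column text
    (1.0, 0.00, 0.8),  # page 4: single column dominated by a table
]

groups = group_pages(pages)
print(groups)  # [[0, 2], [1, 3], [4]]
```

Pages that look alike end up in the same group, which is the property that lets one tag structure be applied to every page in that group.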
But before we do, I just want to make it clear that the underlying technology that we're talking about here can be deployed in several ways to align with your organization's requirements. So first and foremost, what we're going to be seeing during the demo is an interface we've built that allows us internally, and potentially you if it's the right type of use case, to run the process from start to finish. So basically doing bulk uploads of documents, running the batch process, and then downloading the finished PDFs. We can also take the technology (and this is kind of what we ultimately envision for it) and embed the AI models into an existing document creation and delivery system through the use of APIs. So this would probably be critical for customers needing to download their private documents like a monthly statement, an explanation of benefits, medical test results, or investment portfolio reports and status updates. Those are the types of documents that are produced en masse but contain private and sensitive information. Lastly, Equidox can operate the entire process on your behalf as a managed service. So we can take care of the remediation as well as the validation to ensure that everything exceeds all accessibility requirements, and then we would deliver fully compliant PDFs back to your organization to then be posted and distributed. So we'll talk a little bit about that in one of the demonstrations as well.

Just one more thing to note: Equidox AI is tagging the PDFs at what we call the post-processing stage. You'll see this during the demonstration, and what I mean by that is these PDFs have already been created and we are applying the accessibility as a final step before they are publicly distributed.
The advantage of tagging PDFs post-processing is that we do not have to disrupt or completely rebuild your document creation process, which is probably fully established and has been vetted by your organization over a long period of time, and it wouldn't be ideal to have to completely redo that from scratch. So your designers and your producers of mass documentation can continue their process the way that they've been doing it, and we will handle the accessibility component at the very end of the creation stage, right before the document reaches your customer.

Okay, so we're going to jump into the demonstration. I'm going to leave the slide deck for just a minute and switch over to our batch interface. So again, this is an interface that we have built pretty much just for demonstrations, to help people visualize what the technology is actually doing. But again, this technology can be deployed in a number of ways to align with your specific use case and any internal requirements around security or integration that your company or your organization would have.

So what we'll do to get started is I'm going to go to the upload documents tab here on the interface and then I'm just going to open up the folders on my hard drive. What I'm going to do first is grab a batch of financial statements. So this is just a simple .zip folder that contains, I can't even remember, 20 or 30 sample financial statements. We're just going to use this for sort of a small-scale example. If I drag and drop that batch of documents into this, I can then press the upload button. So I'm just going to give it a few seconds to upload, and once it uploads, it's going to be available to have the AI models applied to those various documents. So if I now go to the Create and Run Batch tab, I have a dropdown menu to select.
I have some different models that are preloaded here into my own little private demo account. One of the models is called “Example Statement,” so we're going to use this model to apply to those documents. Now I just have to select the .zip radio button, and I'm going to choose the financial statement .zip folder that I just uploaded, and then I just press Run Batch. Now this is going to kick off an automated process where Equidox is going to first unpack that .zip folder and then identify all the various elements within these different statements.

Now these statements are all relatively similar to each other, but they can have quite a few variances. So just think about what your credit card bill might look like. You might have a credit card bill one month that has just a single page because you only used the card a couple of times and you only have a couple of charges. Then you might have another month where maybe it's holiday shopping season and you have 200 charges on that credit card over the course of the month, and then suddenly your bank statement or your credit card bill looks a lot different. It's got three, four, five pages breaking down every single one of those charges, usually in a table format. So these are just some examples of where you can have differences even though the documents are similar and are coming from really the same source.

So while I was talking there you might have noticed these green lights lighting up across the screen. After Equidox unpacks the .zip folder, it will start applying the machine learning zones based on what it knows about this template. Once it finishes with the ML zones, it's going to run this export process, and we can see these green status bars again lighting up.
And then once the documents are finished, we get a Job Finish and a Job Success green light, and all of these documents are again available for download. I can also see some basic information up here, like how much time elapsed for that process to run, how many documents ran, and how many total pages. We're not too concerned about that right now; we're really just going to look at the resulting PDF.

So before we get into the completed one, if I just unzip this for a second, let's look at one of these documents that we started with. These documents were completely untagged, so there's no tag structure at all. This would be a completely useless page to someone who was blind. They would not be able to read any of this information. They would not be able to understand their deposits and credits and withdrawals. All of this information would be completely lost on them because this document is not tagged at all, or if it were tagged it was probably not tagged properly.

So to see what we do through that AI process, let's take a look at one of the documents that came out of the batch. I'll download this and put it on my desktop just so we know it's the different one, and I'll open up the document that I just created. Now this document here, if you can tell underneath the accessibility tags tab, is completely different. We have all of the elements accounted for on the page. But not only are they accounted for, they are accounted for correctly. So we have your bank name and your customer name information up here, and we have a figure, which would be, let's just say, the logo of the bank in this example that we're using. So we can navigate through all of the different content. We have our heading structure, we have an H2, we have a table.
The table is properly tagged, and if you're not too familiar with what the tag structures look like in accessible PDFs, this is kind of the whole point: it is pretty complicated, and it's very slow and manual to set this up document by document. So the use of automation and AI dramatically simplifies this process and totally takes humans out of the equation. Because of the training work that we did on templates like this, our AI is able to fully understand and recognize the differences between these different elements and account for them in the tag tree in a fully automated way.

Another use case that I can quickly explain is going to be a totally different document. Let's go here and I'll use this document. If I upload this one and we take a quick peek at it, this is a totally different use case. So we were just looking at a bank statement, which is kind of applicable to invoices or a pay stub or test results on the healthcare side of things. There are a lot of different use cases that would use things like statements, but this is an example where we have a listing from a physician directory. These are thousands of pages long, they go on and on and on, and they are consistently updated. They're updated on a regular basis, and that would require the user to go back to the document and re-remediate it month after month, or quarter after quarter, or year after year. On top of that, those documents are also created in many different languages, so depending on the market that the physician directory is located in, they're typically being produced in multiple languages. At least English and Spanish, sometimes Chinese, sometimes Arabic; it just depends on the region of the country. But the volume of those physician directories is impossible to deal with, and they're actually quite complicated documents.
You have very complicated heading structures throughout, you need to, of course, make pages like this, and there are all kinds of different things that span across multiple pages that need to be accounted for. There's simply no way for humans to be able to remediate these at the volume of literally thousands and thousands of pages that are constantly being updated on a monthly basis. So this is a use case that we're currently solving for. We have customers that are dealing with, like I said, literally millions of pages just like that across their different networks. And so we're accounting for that through the use of AI, because they're dissatisfied with the results they were seeing from both auto-taggers and outsourced human remediators. The accuracy was terrible, the speed was too slow, and it simply wasn't good enough for what they needed.

So we're letting the machine learning do its thing here. This is a long document and there are a lot of tags for it to interpret and apply. But what we'll see when we export the document is very similar to what we saw when we exported the bank statement. We are going to have accurate tags where it's properly accounting for the reading order and properly accounting for the heading structure, and all of the different text elements are going to be identified. My little private environment seems to be a little bit sluggish today, but we'll get there.

Once it gets into the export it should go rather quickly.

Okay, so with that said, just in the interest of time, I realize that we're at 2:30 and people might have to be dropping off. So while we're waiting for that, oh, there we go. Perfect timing. It just finished. So if I open up this document now, and again I'll open it up in Adobe Acrobat, we remember that the original was not tagged at all. But I just wanted to show you that this one is properly tagged.
So when you open up the tag structure you can see that all of these different elements are accounted for, and they're accounted for in an accurate way, which is critical because this is an extremely difficult document to use if you're blind and these elements are not tagged correctly. Just imagine if it reads left to right across the three columns: you would have no idea who any of these doctors are or where they're located. It would be impossible to use. So that's what Equidox solves for.

Now with that said, I'm going to jump out of the demonstration and we'll go back into the slide deck just to wrap things up. We do have some articles that will be listed in the slide deck, so when we share the slide deck out, if you'd like to learn a little bit more about PDF accessibility, please feel free to browse through them or visit our website. And in conclusion, I just want to say thank you to everyone for joining us here today. We hope you see the value and the capabilities of the new technology. Please do not hesitate to reach out to one of us for more of a one-on-one consultation so that we can discuss your organization's unique use cases and how Equidox AI can be applied to them. And again, we will be sending out the recording of this webinar, so please feel free to share it with anyone in your organization. We will include a link to the slide deck, and for anyone who asked a question using the Q&A feature, we will get back to you as soon as possible. And again, there will be a short survey, so if you don't mind just taking a moment to fill that out, we would greatly appreciate it. So thank you again, everyone, for joining, and have a great rest of your day.

For more information about how Equidox Software Company can help you with PDF accessibility, email us at EquidoxSales@equidox.co, give us a call at 216-529-3030, or visit our website at www.Equidox.co.
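For readers who want a concrete picture of the batch flow shown in the demonstration (upload a .zip, unpack it, process each document, report totals), here is a minimal Python sketch. It is not Equidox's implementation or API: `tag_document` is a hypothetical placeholder that returns its input unchanged, and the file names and summary fields are assumptions made for illustration.

```python
import io
import time
import zipfile


def tag_document(pdf_bytes):
    """Hypothetical placeholder for the model-driven tagging step.
    It returns the bytes unchanged; a real step would add the tag structure."""
    return pdf_bytes


def run_batch(zip_bytes):
    """Unpack a .zip of PDFs, process each one, and return (outputs, summary)."""
    started = time.perf_counter()
    outputs = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            # Only PDF members are processed; anything else is skipped.
            if name.lower().endswith(".pdf"):
                outputs[name] = tag_document(archive.read(name))
    summary = {
        "documents": len(outputs),
        "elapsed_seconds": time.perf_counter() - started,
    }
    return outputs, summary


# Build a tiny in-memory archive so the sketch runs without real files.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("statement_01.pdf", b"%PDF-1.7 placeholder")
    archive.writestr("statement_02.pdf", b"%PDF-1.7 placeholder")
    archive.writestr("notes.txt", b"not a PDF, skipped")

outputs, summary = run_batch(buffer.getvalue())
print(summary["documents"])  # 2
```

The summary mirrors the kind of job information mentioned in the demo (documents processed and elapsed time); a page count would need a PDF parser and is left out of this sketch.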

Webinar Agenda

  • Briefly review legal requirements
  • Learn how automated PDF accessibility works
  • View a live demonstration of Equidox AI, an automated PDF remediation solution
  • See before-and-after results of the automated tagging process

 
