Mastering PDF Accessibility

Why you need to make PDFs accessible, a PDF remediation project plan, and document workflows.

Video transcript

[Dan Tuleta] Okay, well, my clock is showing two o'clock, so I think it's a good time to get started. Welcome, everyone. My name is Dan Tuleta. I'm the Senior Sales Engineer here at Equidox. I appreciate everyone joining us today for another edition of our Equidox Webinar Wednesdays.
Today, we're going to be talking about mastering PDF accessibility, so let's get into it. Just quickly through the agenda of what we're going to be covering today during the webinar. First, we'll introduce who we are as Equidox. Then we'll talk a little bit about why we make PDFs accessible, how to put together a plan for PDF accessibility, and then some of the actual mechanics of the PDF remediation workflow. And then I'll be finishing things off with an Equidox demo.
So, for anyone who has never seen our software in action before, you'll be able to get a peek at what Equidox looks like from a document remediation workflow perspective. It's a little bit different and, obviously, better and faster than Adobe Acrobat for a similar process of making PDF documents accessible.
Okay, so first and foremost, Equidox Software Company. Our mission statement is to enable PDF accessibility through intelligent automated solutions. We are an accessibility company with a niche in the PDF document space, so we are a leader in PDF accessibility. We're probably most known for our flagship Equidox software as a service, a web-based application for making PDF documents accessible.
We also offer an AI solution for specific types of use cases, which we're not going to really get into in this webinar. But if you have high-volume, repetitive types of documents that follow a similar formatting and structure, please let us know. We definitely would love to chat with you about that if you have any use cases in mind.
Okay, so let's talk about some of the market challenges. PDFs are really one of the most difficult document types to make compliant with the Americans with Disabilities Act and the related standards, Section 508 and the WCAG guidelines. Equidox makes PDF accessibility easier and faster.
We also put a real emphasis on quality. Equidox is much easier to use than a competitive tool like Adobe Acrobat, so it allows you to get through a larger volume of pages and documents, and you can put the power of the tool into the hands of a much larger group of users. You no longer have to silo this to just a handful of resident experts in your organization. You can really have everyone being more responsible for the content that they create.
Because you can go faster and you can deploy it to a broader group of users, it really allows you to put more of an emphasis on quality. One of the problems with PDFs is that organizations have so many of them, because they're being produced on a daily basis and, in many cases, have been produced for decades. Really, no one has ever addressed them from an accessibility standpoint.
There are massive backlogs of documents on top of the documents that are being produced on a day-to-day or week-to-week basis. It's overwhelming for organizations, and that leads them to having to outsource it overseas, where you get very sloppy work that's being done just to make it pass a checker. In some cases, if you're doing it internally, people just end up having to cut corners because they're simply overwhelmed with the volume of content that they have to work through. They don't have the manpower, and they don't have the time in the day to work through all of the backlogs and the constant flow of documents coming across their desks.
Our ease of use and the speed that Equidox allows you to remediate at let you really put an emphasis on quality. For all of the documents that you are publishing in a public-facing place on your website, or in any other applications, LMS, or content management systems, you're always going to be able to make sure that those documents are not only compliant but also usable for people who are using assistive technologies.
So, why do we make PDFs accessible? Well, mainly, you have to. It's part of the Americans with Disabilities Act and Section 508 of the Rehabilitation Act. Now, I'm not personally a lawyer, so I'm not going to explain all of the minute details of the laws. But we do have links in here that you're able to access if you'd like to read up on these different standards and guidelines and the things that you should be doing around PDF documents.
So recently, in 2024, there was a final ruling on digital accessibility under Title II of the ADA, and it mandated that WCAG 2.1 AA standards must be met for web content and mobile applications. This, of course, includes the PDF documents that are on your websites and applications.
Education institutions as well as state and local governments both fall under Title II, so it really is something that you have to be aware of, start managing, and have a plan in place for. There are certain exceptions underneath the Title II ruling, but in most instances, PDFs still need to be made accessible.
So, a general rule of thumb is that if the PDF document is still actively in use or relevant to people, if it's on your public-facing website, it likely needs to be made accessible. Don't just assume that the exceptions always apply to you because, in general, they will not.
And the DOJ provided a fact sheet citing some of these examples where exceptions do not apply, and there will be a link here for all of those exceptions within that ruling.
So, just to help quantify the situation, Equidox, in cooperation with the National Federation of the Blind, did a survey of several hundred blind and low-vision people who use assistive technology on a day-to-day basis, asking about their experiences. What their experiences have told us is that at least two out of three PDFs are not accessible.
So, 67% of the documents that are floating around on the web in active daily circulation are not accessible. That presents a huge challenge to people who rely on assistive technology and are trying to interact with those documents and extract the same data that a sighted user would be able to.
They simply cannot have that same experience because the documents are not accessible, they're not compliant, and they're not compatible with their assistive technology. So they are left to find alternative ways to get that information, which leads to all kinds of challenges.
Okay, so let's talk about creating a plan for how we go about attacking PDF accessibility. There's definitely somewhat of an order of operations. Now, this can vary from organization to organization, depending on the size of your organization and the size of the PDF problems that you have.
So, this is not exactly a rigid set of eight things that you must do, but in general, a good order of operations would be to start by assigning staff. You obviously need personnel to help you address this problem.
Next, you want to evaluate the scope. So, how big is this problem? Do we have documents that are 30 years old that we can archive and just completely wipe off of the website?
Or do we have documents that always need to be posted online, and there's really no way that we could ever get rid of them?
Are we talking about a couple of hundred documents, a couple of thousand documents, even tens of thousands or millions of documents in some extreme examples? Understanding how large a problem this really is, is a good thing to do after you've assigned staff to help you evaluate that.
Putting a written plan in place is always a good thing to do. It's always good to show that you, as an organization, are making a commitment to this plan to address PDF accessibility. So, get that written plan, distribute it within your organization, start holding people responsible, and make sure that everyone is following along with the plan.
The next thing you want to do is prioritize the documents. It's important to make sure that you're addressing the most important things first and the least important things last. We'll talk a little bit more about that on the next slide.
You, of course, will want to choose a tool or a vendor. Some organizations, depending on the situation and all of the variables, might choose to outsource PDF remediation to a third party. So, hiring a company to do it for you.
That has its advantages and disadvantages, but that's one way to go. Then you can also, of course, choose a tool. There are different PDF remediation tools out there, the primary one being Adobe Acrobat.
Acrobat is kind of synonymous with PDFs; it's what the majority of people rely on. But there are alternatives, which we'll get into as well. Here we are in an Equidox webinar, so, of course, we'll talk a little bit about an alternative tool for this.
Then, of course, you need to remediate the documents, which we'll talk more about, and I'll show you how we go about remediating PDFs. Then, of course, validate the documents, just making sure that everything that you did during that remediation process is truly making the document compliant and usable.
And then, of course, everything takes constant maintenance, right? It's not like you just get accessible and you're done forever. As your website, your applications, and your documents evolve over time, there are constantly new iterations of different things going up onto the website.
So, you just want to make sure that once you've gotten yourself caught up as an organization, you stay caught up. Don't get behind the eight ball again. It requires that constant maintenance to make sure that you are, in fact, staying compliant.
Okay, so addressing PDF remediation projects. One good first step is to remove the outdated or unnecessary files to help shrink down that pile of documents that you need to remediate.
We've helped organizations do some evaluation of their documents in the past, and it's pretty interesting what you'll find floating around on websites.
You might have an invitation to the company picnic from 1998 still posted on the website or something like that. Do these types of documents really need to still be living online? Old newsletters, old memos that are completely obsolete.
There's a good chance that you have many documents sitting on your website that present a compliance issue and a risk of future litigation, and that don't actually need to be posted online anymore. So, it's a good thing to evaluate your site and everything that's posted publicly and remove the stuff that's not necessary.
You also want to evaluate the number of PDFs that need to be remediated and their complexity. That's a good thing to understand: how complicated are these documents to remediate? Is it just simple text on a page?
Are there more challenging documents like fillable forms? Do we have OCR content, stuff that is just scanned in and turned into a PDF?
And then also decide what will be done in-house and what will be outsourced. If you do decide to go with a mixed approach of outsourcing some and doing some in-house, make sure that you have a handle on exactly what you want to ship out overseas versus what you would like to keep in-house.
Okay, and I also mentioned we would talk a little bit about how to go about prioritizing documents.
Just at a high level, our general recommendation would be to start with the content that is the most frequently used. So, let's say you have a document that is front and center on your website, and when someone comes to your website, there's a good chance they're going to open that document up.
You really want to make sure that that PDF is made accessible because that's going to impact the most people.
Next, the most recent content. The stuff that's the newest, the latest and greatest, you, of course, want to treat as a high-priority document to make accessible because it's the most up-to-date content on your site.
And then from there, you would probably work towards the lowest complexity. What I mean by that is the easiest stuff to remediate.
The reason I would put that as the third element on the list here, instead of putting it towards the end, is that you want to be able to get through the largest number of pages possible in the shortest amount of time, because that, again, is going to be more impactful.
And then everything else. That would be your old documents and your really complicated documents that are going to be more time-consuming to remediate. I would put those towards the end because those are probably going to have the least impact on the majority of people.
So, that's our general recommendation for how you go about approaching this. But of course, there are always variables that come into play for every unique organization. So, this is not exactly a rigid set of rules, but it is our general recommendation and a good place to start when you're evaluating the scope of your project.
Okay, so when I go through the actual demonstration today, you're going to hear me talk a lot about tags.
Just in case anyone is not aware of what tags are or how they work: all elements on a page require a digital identifier, known as a tag, to be read by assistive technology.
Different elements on different PDFs are going to be tagged in different ways. Some of the primary tags that you'll see a lot of are things like text, images, headings, links, lists, and tables, and, of course, the reading order of all of the content on the page is very important as well.
So, these different tags are a way of organizing the information on the page to make sure that the screen reader is going to navigate this document and read the content in the same way that a sighted user would read the page.
There are some general pre-flight document observations that are good to take into account.
First of all, how many pages are there in this document?
Is it a simple one-page document, or is it a 500-page textbook? There's a difference in how you might go about approaching that document project depending on the size of it.
How complex is it? Is it a very difficult fillable form with 500 text input fields that need to have tooltips written for them, or is it just a couple of paragraphs on the page that are very simple to tag?
Is the design and the formatting of the document consistent throughout, or is it some sort of Frankenstein document that was brought together from three different sources and strung together as one PDF document?
Is there an existing tag structure? If the tags are there, are they worth keeping, or are they tags that you really ought to just throw away so you can start over from scratch? It's always case by case with PDFs. You don't always know.
Does the document require OCR? OCR stands for Optical Character Recognition. What that means is, is the document just an image? That might be common if you have a scanned document, for example. Not all PDFs are created equally. Sometimes a PDF is simply a scan of a page.
So, as far as a machine is concerned, it's just an image, and within that image, there might be text that's readable to a sighted person, but you need to use a process called OCR to extract that text from the image so that it's actually readable and usable by someone who's using a screen reader.
Are there form fields? Fillable forms take a little bit of extra work and TLC to make accessible. That's just the nature of a fillable form. But understanding whether the document is a form or just a standard, plain PDF is important to know.
And also, are there images? Are the images informative and technical?
Do you have diagrams and charts and things that are providing technical information about the document, or are they just decorative?
Is it just repetitive logos? Is it just black and blue and red background colors that are there in the form of an image, but really only for the visual aesthetics of the document? It's, again, always case by case, and these are things to take into consideration when you're looking at a document that you're about to remediate.
And then the workflow for PDF remediation.
In general, I would recommend tagging all of the text. There are certain exceptions to that, but as a general rule of thumb, tag all of the text.
Set all of the headings. There can only be one heading level one, for example, and we'll talk a little bit more about heading structure when we get into the demonstration.
Add alt text to the images, and artifact any of the images that are not needed. So, again, using the example of repetitive logos or background colors that are there for decoration purposes, if those are images in the document, you can artifact them so that the screen reader does not have to read them, especially when it's something that's just repetitive.
Tag the lists, tag the tables, and make sure that all of the hyperlinks are going to the correct URL or the correct destination, wherever they're bound for.
Set the reading order for each page. That's, of course, critical. Even if the tags are accurate, the tags must be read in the correct order, or else it can render the entire page useless.
Imagine a three-column article where the screen reader is just reading clear across the three columns. You're going to hear just a bunch of fragmented sentences and words that together will not make any sense.
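The three-column problem described above can be illustrated with a minimal Python sketch. It assumes each tagged zone is reduced to a text label plus the position of its top-left corner, and that columns have a known, fixed width. `Zone`, `naive_order`, `column_order`, and `column_width` are hypothetical names chosen for illustration; this is not how Equidox or any screen reader is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A hypothetical tagged region on a page."""
    text: str
    x: float  # left edge of the zone, in page units
    y: float  # top edge of the zone; smaller means higher on the page


def naive_order(zones):
    """Sort top-to-bottom, then left-to-right: this reads across columns."""
    return sorted(zones, key=lambda z: (z.y, z.x))


def column_order(zones, column_width):
    """Sort by column first, then top-to-bottom within each column."""
    return sorted(zones, key=lambda z: (int(z.x // column_width), z.y))


# Three columns (A, B, C), two paragraphs in each.
zones = [
    Zone("A1", 0, 0), Zone("B1", 200, 0), Zone("C1", 400, 0),
    Zone("A2", 0, 50), Zone("B2", 200, 50), Zone("C2", 400, 50),
]

print([z.text for z in naive_order(zones)])
print([z.text for z in column_order(zones, column_width=200)])
```

The first ordering jumps between columns after every paragraph, which is the fragmented reading the speaker warns about; the second finishes column A before starting column B. Real pages rarely have perfectly regular columns, which is one reason the reading order still needs a human check.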
So, even if the tags are in the proper spot, it would render the entire page useless because it's all being read out of order.
And then, of course, validate your work, and we'll talk a little bit more about how to do that.
We're actually linking to a workflow blog in this slide deck, which we will share with you after the presentation.
Okay, so here is the slide where we will embed the link to this demonstration.
But I will jump out of my PowerPoint, and I will jump into Equidox.
So, here I am inside of Equidox, and, just at a high level, we're operating in a browser. It's worth mentioning that Equidox is not software that you need to install or update or dedicate a license to on any individual machine.
You can truly work from anywhere as long as you have an internet connection. We also work with a concurrent user licensing model, so it's important to understand how the licensing works.
Let's just say your organization has 10 concurrent users. That means any 10 people within your organization could use Equidox simultaneously. Maybe you have 20 or 30 or 50 people who can access Equidox; it's just a matter of only 10 at a time.
So, you can deploy it to a much bigger group of users without having to dedicate an annual subscription to every potential user who might only need to log in for 10 minutes once a week.
So, just something to consider. It's a different approach than how things work with Adobe Acrobat, where every person needs to have a license of that software installed on their individual computer.
And if you don't have that license installed, then you just can't use Acrobat.
Okay, so what I'll do here is go into this document. I imported it just before we got started.
I'm going to use this just to walk through the basics of document remediation.
Inside of the document here, I can see that I have a thumbnail for the one single page, and if I click on that thumbnail, it will take me into the remediation page.
Inside of the remediation page, the first thing that I'm struck by is these yellow rectangles.
What these yellow rectangles are representing is essentially the tags.
Just so everyone is aware, this document was completely untagged to begin with.
If I were to open this up inside of Adobe Acrobat, it's a completely untagged document.
That means a screen reader wouldn't know what to do with any of the information on this page.
It's not accessible in any way, shape, or form because it's completely untagged.
Now, when I imported it into Equidox, Equidox was smart enough to recognize that there's a bunch of content on this page, and it put these reading zones into the places where it thinks they make the most sense.
So, it's effectively auto-tagging it just by importing the document.
However, we would say that you need to take it much further than just auto-tagging because every PDF document is unique, right?
This PDF document is different from the next one that I'm going to work on and the one after that.
So, there isn't a universal set of rules that we can apply through an auto-tagging process and hope for the best.
There are always nuances and details that we need to consider in each individual PDF document.
We don't want to just import and export and cross our fingers and hope that it worked perfectly.
We definitely need to go through this document and make sure that all of the different elements are accounted for.
Now, one button that's really important inside of Equidox is this button right here, which looks kind of like a computer monitor.
When you press this button, it will open up a separate tab in your browser, and in this browser preview, you can see an HTML rendering of the page that you are currently working on.
The reason that this HTML is really useful is that it is essentially a representation of how a screen reader would read this page if we were to stop working on it and export it as it currently sits.
It doesn't really take much of a trained eye to see that there are some pretty glaring issues here.
Sure, most of the paragraphs are going to be read out loud, and that's perfectly fine.
But we have images that we need to address.
We have a list right here that's currently not set up as a list at all.
This is supposedly our table, so you can see what a mess the table is.
All of that data has no structure to it.
You would just be giving a screen reader user a bunch of random numbers that would make no sense whatsoever.
So, we need to make sure that we're addressing these types of issues.
Now, if I go back to the PDF, I want to call out another important feature within Equidox. It's a feature that I don't really need to use on this document, but it's this sensitivity slider.
When I move this slider back and forth, left and right, you'll see how these yellow rectangles rearrange themselves.
You can use that slider to choose your best possible starting point, which I really already had when I arrived at the page.
But if you want to move this around and see if it can do any better, it gives you a chance to see if there's a more optimal place to begin your work from, where you'll have less work to do with the individual elements.
So, just for example, if you were to bring this way over to the right, it takes kind of a 10,000-foot view of the page.
And if you were to then go to your preview, you'll see it's kind of a problem here.
This is the wrong end of the spectrum.
We don't want all of those zones to be grouped together as one giant paragraph because that's equally incorrect, just on the other end of the spectrum.
And then, alternatively, if you bring it all the way down to zero, all of that content is going to be removed from the preview, and you're just left with the two images.
So, again, you really just want to find that sweet spot in the middle, where everything is going to give you a nice clean starting point.
Equidox can reduce that to, I don't know, how long does it take to hit L and nudge the slider? Maybe 5 seconds. So you can imagine how much time you would save over the course of even just one document.
You're saving yourself probably close to 15 minutes on this list right here.
And you can see again that you have the exact same structure in this list as you do in the PDF itself.
This element here is our table, which right now is kind of a mess. You can probably tell we don't want individual zones covering up all of the different cells. That's what leads us to this terrible-looking structure here, where it's just a bunch of random numbers.
But we can, of course, fix that.
What I'm going to do is put a zone right on top of that entire table by clicking and dragging.
Then, if I hit T on my keyboard (T for table), I can double-click on the table zone.
And if I then look at just the table, now isolated inside of the table editor, I can see these green grid lines here, which you're free to drag around if you like.
But you can also use the table detector.
You're probably seeing the theme here of these very easy-to-use sliders. When you nudge the sliders from left to right, it will wake up the artificial intelligence, and you can probably tell that these green grid lines are now in the perfect locations.
So everything is in line with its row and column.
If I go to the preview just after making that quick change, you'll see we've made a pretty dramatic difference.
Instead of having whatever all this is, we actually have something that looks very similar to a table.
I do have two little things that I need to correct, though, mainly the column headers here.
If I hold Shift on my keyboard while I select the cells, I can span across all of those cells.
That way, I'm not duplicating numbers, and I'm making sure that 23 and 24 actually straddle the four sub-columns that they are the column headers for.
Also, this table has two levels of column headers, so I just want to hit my up arrow over here to change the default of one column header row to two.
And if I go back to my preview, what I'll find is that the second row, which is represented by that bold font, is now also going to be tagged as a table header.
So all I have to do is save the table and close out of it.
All of those individual zones that I had before have been overridden by the table, so now I have just a single table zone on top of everything.
All of the little individual things that I had there before, the individual TCH zones, were removed.
As for the images, I won't spend too much time on these because they are very simple.
For this Equidox logo, I'm simply going to call it the Equidox logo over here in the alt description field.
And this one is really just a similar rendering of that logo, but it has a dog in the picture.
I would argue that this is just a decorative image; it's there to take up space.
So I can simply hit Backspace on my keyboard to get rid of that image.
You can see the zone itself disappears, and if you go to the HTML preview again, you'll see that we've cleaned things up quite a bit.
One thing that I'm noticing is that my footer has somehow jumped in front of my table.
The reason why is that when I added that table zone, it just naturally became the last zone on the page, which is what this little number indicates.
So if I hit reorder and refresh the preview, you'll see now I have my table in the proper spot, and the footer comes at the very bottom.
Reordering the page will, by default, reorder it in a top-to-bottom way.
You also have different options for multiple-column layouts as well, so it just depends on what the page is calling for.
So now this document is fully accessible, it's fully readable, it's fully compliant.
All I would need to do is go to the export tab and hit generate PDF, and it will produce this brand new document for me, which is going to be fully tagged from start to finish, just like I saw in my HTML preview.
And if I open this up in my Acrobat application, I'll be able to show you the tags, if Acrobat will wake up.
Of course, it will sleep on me. Let's download it, I guess.
So this is the new one that I just created, and you can see this document is fully tagged.
Alright, so, excuse me, wrong X; I want to hit this one.
So back to the slide deck. We'll just wrap things up here in the interest of time.
Let's see, we are on this one here.
We have a few links to relevant articles here, with more in-depth information based on what we were talking about.
And that's going to conclude our webinar for today.
So again, we are Equidox.
You can reach out to us at EquidoxSales@Equidox.co; that's a general sales email address if you'd like to reach out.
Our phone number is 216-529-3030.
Of course, you can find us at www.Equidox.co, Facebook, any of your preferred platforms.
And we are always happy to take individual calls if you'd like to just reach out and have a deeper discussion about your specific needs around PDF remediation or to see a more personalized demo, even using your own documents.
We'd be happy to accommodate that, so please don't hesitate to reach out.
So thank you very much, everyone. Have a great rest of your Wednesday afternoon.
For more information about how Equidox Software Company can help you with PDF accessibility, email us at EquidoxSales@Equidox.co, or give us a call at 216-529-3030 or visit our website at www.Equidox.co.
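The prioritization rule of thumb from the planning part of the webinar (most frequently used content first, then the most recent, then the lowest complexity, with everything else at the end) can be sketched as a sort key. This is only one possible reading of that advice, under stated assumptions: the `PdfDoc` fields, the 1 to 5 complexity scale, the view counts, and the file names are all hypothetical and do not come from the webinar or from any Equidox feature.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PdfDoc:
    """Hypothetical inventory record for one PDF on a website."""
    name: str
    monthly_views: int   # assumed to come from web analytics
    last_updated: date
    complexity: int      # assumed scale: 1 = plain text, 5 = large form or scan


def priority_key(doc):
    """Most used first, then most recent, then easiest to remediate."""
    return (-doc.monthly_views, -doc.last_updated.toordinal(), doc.complexity)


backlog = [
    PdfDoc("1998-picnic-invite.pdf", 2, date(1998, 6, 1), 1),
    PdfDoc("admissions-form.pdf", 900, date(2024, 1, 15), 5),
    PdfDoc("annual-report.pdf", 900, date(2024, 3, 1), 2),
    PdfDoc("course-catalog.pdf", 4000, date(2023, 8, 20), 3),
]

queue = sorted(backlog, key=priority_key)
for doc in queue:
    print(doc.name)
```

A strict lexicographic sort like this lets usage dominate everything else; an organization could just as reasonably use a weighted score, or first archive files like the picnic invitation rather than queue them at all.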

Mastering PDF Accessibility

PDF accessibility doesn’t have to be hard.
In this 30-minute webinar, learn how to approach large PDF remediation projects, follow a workflow for tackling each PDF and its elements, and see how Equidox software makes PDF accessibility easy, even if you have no prior accessibility skills.

Agenda

  • Why make PDFs accessible?
  • A plan for PDF accessibility
  • PDF remediation workflow
  • Equidox demo

Mastering PDF Accessibility Slide Deck
