JS Document Object Model
Explore, traverse and modify web pages with the power of JavaScript DOM properties and methods.
What is Document Object Model?
The Document Object Model (DOM) represents the structure of a web document and serves as its programming interface. In simpler words, it lets JavaScript talk with the HTML of the page. And that is huge, especially in the Marketing Automation world.
Why? Because it lets you manipulate your website in many ways - visible and not. Dynamic forms adapting to user actions, pages changing in real-time and enhanced data capture. All that (and so much more) is possible thanks to DOM access. So let's dive into details.
Accessing the DOM
To start working with Document Object Model in JavaScript, you need to use one of the special objects - window
(browser tab) or document
(page within that tab). Each of those objects offers many methods that let you interact with a webpage.
JavaScript lets you capture specific parts of your website and make them accessible for all your scripting needs with the help of document
object methods:
- getElementById: returns an element with matching id attribute (for example, elemnt with
id="emailAddressField"
usingdocument.getElementById('emailAddressField')
) - getElementsByTagName: returns a collection of matching HTML tags (for example, all
<p>
tags usingdocument.getElementByTagName('p')
) - getElementsByName: returns a collection of elements with matching name attribute (for example, all inputs with
name="email"
usingdocument.getElementByName('email')
) - getElementsByClassName: returns a collection of elements with matching class attribute (for example, all elements with
class="hiddenContent"
usingdocument.getElementByClassName('hiddenContent')
) - querySelector: returns an element with a matching CSS3 selector (for example,
<h2>
element that is within<p class="hiddenContent">
usingdocument.querySelector(p.hiddenContent h2')
- notice the.
prefix for a class). If there are multiple matching elements - it will return the first one. - querySelectorAll: returns a collection of elements matching CSS3 selector (for example, all
<tr>
elements that are within elements withclass=".DataCell"
that are within an element withid="attributes-repeater"
usingdocument.querySelectorAll('#attributes-repeater .DataCell tr')
)
querySelector
and querySelectorAll
allow you to select elements using the power of CSS3. It means that you can go really specific with the proper syntax.
You can select:
- Tag by its name:
document.querySelector('body')
- Id with
#
prefix:document.querySelector('#idName')
- Class with
.
prefix:document.querySelector('.className')
- Tag with specific class by chaining:
document.querySelector('div.className')
- Element with two classes:
document.querySelector('.className1.className2')
(notice lack of space between classes) - Element that is direct child of another element:
document.querySelector('div > h2')
(notice the>
symbol between tags) - Element that is any child another element:
document.querySelector('div h2')
- Element that is any sibling (has the same parent):
document.querySelector('p.className ~ h2')
(this will captureh2
that is under the same parent asp.className
) - Element that is adjecent sibling (has the same parent):
document.querySelector('p.className + h2')
(this will captureh2
that is under the same parent asp.className
and after that paragraph) - Element with specific attribute:
document.querySelector('[href]')
- Element with specific value of attribute:
document.querySelector('input[type="checkbox"]')
- One of the listed elements:
document.querySelector('ul, ol')
- All of the listed elements:
document.querySelectorAll('ul, ol')
- Elements targetet by pseudo-class:
document.querySelectorAll('a:visited')
- Pseudo-element:
document.querySelectorAll('h1::first-letter')
- Element with negated selection:
document.querySelector('.className:not(div)')
Sky is the limit, especially as you can chain all of the above into super query:
document.querySelector('#content > article > div:nth-child(20) a:nth-child(1) > code');
However, whenever possible, optimise. Either by finding a better way to select the element or by adding an easily selectable attribute to that element in the page HTML.
As you can see, there are many options you can leverage.
Should you use getElement or querySelector?
It depends on the purpose. The rule of thumb is that if one of the getElement
selectors can do the job, it's a better choice performance-wise (the longer HTML, the bigger difference). However, for complex selections, the newer querySelector
family might be better or the only possible choice.
// Instead of using three getElement selectors (that wouldn't even work in this example, more on that in the next paragraph)
document.getElementById('attributes-repeater').getElementsByClassName('DataCell').getElementsByTagName('tr');
// you should use a single querySelectorAll selector
document.querySelectorAll('#attributes-repeater .DataCell tr');
// There are also cases where querySelector is the only choice
document.querySelectorAll('input[type="checkbox"]');
As always, the devil lies in the details. While chained getElement
selectors will have better performance than clean querySelector
, there are three issues with the former:
- Readability: after two or three chained selectors, the
getElement
chain gets really hard to read and debug; the natural CSS3 style ofquerySelector
is straightforward. - Flexibility: you can only chain
getElements
on single element. Refering to the previous code sample, while the.getElementsByClassName('DataCell')
called on single outcome of.getElementById('attributes-repeater')
works, the collection it will return would crash the.getElementsByTagName('tr')
as the chaining works only on single element scope. - Loopability: outcomes of
getElements
selection cannot be looped usingforEach()
.
My approach is to use the more flexible and readable querySelector
whenever performance is not a dealbreaker and switch where possible to the getElement
toolset for use cases where the performance is crucial.
You can assign your DOM selection to a variable. It allows you to reuse it in multiple places of your script and lets you limit the scope of the selection.
You can do the latter by replacing the document
object with your variable - it will look for matching DOM elements only within the outcome of the previous selection.
const form = document.querySelector('form');
const divsInForm = form.querySelectorAll('div.legalNotice'); // Returns only div tags with legalNotice class that are within your form
It works like chaining selectors and with the same limitation - you can chain only if the previous outcome is a single element.
/* ✅ Chain from a single element to a collection - same outcome as previous code snippet */
document.querySelector('form').querySelectorAll('div.legalNotice');
/* ❌ Chain from a collection to a collection - will throw TypeError */
document.querySelectorAll('form').querySelectorAll('div.legalNotice');
/* ✅ Chain from a single element (thanks to index) to a collection */
document.querySelectorAll('form')[0].querySelectorAll('div.legalNotice');
/* Of course, in real scenario, you should use a compound selector for the same result */
document.querySelectorAll('form div.legalNotice');
Selecting elements is just the beginning. Once you pick them, you can explore, traverse and manipulate the DOM. You can check what is available for the selected element with console.dir(selectedElement)
in the developer console.
Exploring the DOM
Once you select a page element, you can learn more about it, thanks to properties. There is a long list of available features, so let's focus on the ones most useful in marketing automation and real-time personalisation world.
attributes
With .attributes
property you can list all HTML attributes on selected element in a NamedNodeMap. What is more, you can drill down on those details to get specific values:
/* <main id="content" class="main-content" role="main">…</main> */
document.querySelector('#content').attributes; // returns a Map with id, class and role
document.querySelector('#content').attributes.class.value; // returns 'main-content'
You can also list the names of all available attributes using .getAttributesNames()
. With getAttrubute()
method you can pull a value of a specific attribute. Finally, there is a pair of condition checking methods: hasAttributes()
that checks whether the selected element has any attribute and hasAttribute()
that tells you if the selected element has a specified attribute.
/* <main id="content" class="main-content" role="main">…</main> */
document.querySelector('#content').getAttributesNames(); // returns ['id', 'class', 'role']
document.querySelector('#content').getAttribute('role'); // returns 'main'
document.querySelector('#content').hasAttributes(); // returns true
document.querySelector('#content').hasAttribute('role'); // returns true
Those methods are helpful for non-standard attributes that don't have a dedicated shorthand.
classList and className
The .classList
property lets you directly list all classes assigned to the selected element (in the form of DOMTokenList). It is excellent when you want to loop through to find a specific class or manipulate the DOM.
On the other hand, when you want to do a simple check or condition, .className
is a great shorthand returning all classes as a string.
/* <div class="page-wrapper category-api document-page">…</div> */
document.querySelector('div.page-wrapper').classList; // returns an object with all the classes, length and value
document.querySelector('div.page-wrapper').classList.contains('document-page'); // returns true
document.querySelector('div.page-wrapper').classList.value; // returns 'page-wrapper category-api document-page'
document.querySelector('div.page-wrapper').className; // shorthand of the previous, returns 'page-wrapper category-api document-page'
The .classList
property is the bread and butter of page manipulation, as it allows you to add, remove, replace and toggle classes on an element. Think of hiding and displaying elements, changing the styles and other dynamic scenarios. More on that, in the changing attributes section.
id and tagName
Similarly to class-related properties, .id
property returns the value of the id attribute and .tagName
property outputs the selected tag's name.
/* <main id="content" class="main-content" role="main">…</main> */
document.querySelector('.main-content').id; // returns 'content'
document.querySelector('.main-content').tagName; // returns 'main'
Those two are less frequently used and mostly have some value when travelling through the DOM.
innerText and innerHTML
Another extremely important properties are .innerText
and .innerHTML
. They allow you to look into what is within the selected element.
.innerText
returns a plain text version of element content (including all child tags). Think copy-pasting fragment of the page into a chat.
document.querySelector('#innertext-and-innerhtml').innerText;
// returns ".innerText and .innerHTML"
.innerHTML
, on the other hand, will return a full-blown HTML code of the selected element (including all child tags within). However, keep in mind that it will be the rendered HTML, not the original HTML (so the version adapted to your screen, your device, your context).
document.querySelector('#innertext-and-innerhtml').innerHTML;
// returns ".innerText and .innerHTML<a class=\"hash-link\" href=\"#innertext-and-innerhtml\" title=\"Direct link to heading\"></a>"
While those are already useful for exploration,they shine when you want to manipulate your page. More on that later.
hidden and style
Finally, there are landing page must-haves: .hidden
and .style
. Those two properties describe the CSS of the selected element.
.hidden
is straightforward. It returns a boolean telling you whether the element is hidden from the frontend of the page.
.style
is much deeper, as it returns an object with all possible inline CSS declarations for the element. You can then drill down to return a value of a specific declaration.
document.querySelector('#hidden-and-style').hidden; // returns false
document.querySelector('#hidden-and-style').style.display; // returns ''
As with properties mentioned previously, .style
and .hidden
truly shine when manipulating the DOM.
Traversing the DOM
Think about the DOM as a complex hierarchy of elements. When you select a specific element, it is located somewhere within that hierarchy. And with the help of the properties, you can learn more about structure of that web and travel through it.
Before jumping to the how-to guide, let's first settle the DOM hierarchy naming convention.
<article class="main-page-content">
<h1>Main Header</h1>
<div>
<p>
<strong>Example paragraph</strong> with some written content and a <a href="https://mateuszdabrowski.pl">link</a>.
</p>
<p>
Yet another paragraph of this article.
</p>
</div>
</article>
Let's select the div
tag from the above structure using:
document.querySelector('article.main-page-content > div'); // selects <div>
There are three relationships between the selected element and the rest of the code above:
- The selected
div
tag is enclosed within<article class="main-page-content">
. The tag higher in the DOM hierarchy is called a parent. - The selected
div
tag is not alone within the<article class="main-page-content">
parent tag. There is alsoh1
. The tags at the same level of the DOM hierarchy are called siblings. - The selected
div
tag has twop
tags within itself. The tags lower in the DOM hierarchy are called children.
To sum up, our div
tag has an article
as a parent, h1
as a sibling, and two p
as children. Then, the first p
has two inline children: strong
and a
.
Let's leverage all this information.
parentElement
If you want to go up in the hierarchy from your selection, you can just use .parentElement
(or .parentNode
, which is nearly the same now).
document.querySelector('article.main-page-content > div').parentElement; // selects <article class="main-page-content">
previousElementSibling and nextElementSibling
For traversing the sibling elements, you can use either .previousElementSibling
or .nextElementSibling
to jump to the previous or next element. If there is no such element, you will get null
.
document.querySelector('article.main-page-content > div').previousElementSibling; // selects <h1>
document.querySelector('article.main-page-content > div').nextElementSibling; // returns null
You can also encounter similar properties: .previousSibling
and .nextSibling
. That pair is operating on HTML Nodes and will return more then you might expect. For example, whitespace between the elements (#text
node) or HTML comments. Unless you are sure you need it, .previousElementSibling
or .nextElementSibling
are better choice.
All things children
When you want to go down in the hierarchy, you can use .children
to get a collection of HTML elements.
document.querySelector('article.main-page-content > div').children; // returns collection of two <p> tags
You can either loop through those or pick a specific child with an index. Helpful here can be .childElementCount
, which will show you the number of elements selected.
document.querySelector('article.main-page-content > div').childElementCount; // returns 2
The nice thing is that for the most popular selections - the first and last child - you can use a clean shorthands .firstElementChild
and .lastElementChild
.
/* ❌ Unnecessary complex selection of the first and last child */
document.querySelector('article.main-page-content > div').children[0]; // selects first child
document.querySelector('article.main-page-content > div').children[document.querySelector('article.main-page-content > div').childElementCount - 1]; // selects last child
/* ✅ Optimised and readable selection of first and last child */
document.querySelector('article.main-page-content > div').firstElementChild; // selects first child
document.querySelector('article.main-page-content > div').lastElementChild; // selects last child
Like with sibling selection, here also you have set of similar properties - .childNodes
, .firstChild
, .lastChild
. All three work on Nodes, so those will pick up not only elements but also text (whitespace) and comments. Unless you are sure you need it, .children
, .firstElementChild
and .lastElementChild
are better choice.
The power of DOM traversing
Ok, we know how to traverse the DOM, but why should we? Because sometimes you have to deal with a dynamic DOM and with traversing you can still leverage all exploratory properties.
document.querySelector('article.main-page-content > div').parentElement.className; // returns "main-page-content"
You can also mix and match the traversing properties to jump multiple hierarchy levels.
document.querySelector('article.main-page-content > div').firstElementChild.lastElementChild.href // returns "https://mateuszdabrowski.pl/"
Finally, you are not limited by the need to know the exact path from the currently selected element to another one higher in the hierarchy that you are interested in. You can leverage the .closest
method to find it using the same CCS3 selection as with querySelector
.
document.querySelector('article.main-page-content > div').closest('.main-page-content'); // selects <article class="main-page-content">
Remember that .closest
can return the initially selected element if it fulfils the new selection. If you want to stop it from happening, you can just chain it after .parentElement
.
document.querySelector('article.main-page-content > div').parentElement.closest('.main-page-content'); // selects <article class="main-page-content">