Scriptable Headless Browsers 101: PhantomJS vs. Headless Chrome/Chromium vs. Headless Firefox
Like cURL, but for executing JavaScript, handling external resources, and interacting with webpages programmatically.
Articles in Professional Tools | By August R. Garcia
While tools like cURL are useful for easily fetching raw HTML from webpages, they do not execute JavaScript. In fact, cURL only grabs the file located directly at the specified URL and will not download or load any other external files, such as CSS stylesheets. For requests that also execute JavaScript and apply CSS, as well as generally interact with webpages, there are various other tools and libraries that can be used.
PhantomJS vs. Headless Chrome/Chromium vs. Headless Firefox
PhantomJS
- + Extremely straightforward and lightweight. Basically the same as writing vanilla JavaScript in any other context.
- + Long history; it has existed since at least 2011, so there are large amounts of documentation, relevant forum threads, etc.
- - No longer being maintained
- - Environment may differ from "real" browsers, which can potentially result in discrepancies for testing.
Headless Chrome/Chromium
- + Best command-line interface out-of-the-box.
- + The --dump-dom flag makes integrating headless Chrome/Chromium requests with shell scripts easy; no helper/wrapper script files needed.
- + Well maintained
- + Great documentation
- + Support for exporting/saving requests as PDFs with the --print-to-pdf flag
- - Something something muh Google is evil
Headless Firefox
- + Mozilla is probably a less evil company than Google
- + Solid documentation
Overall:
- PhantomJS was basically "the" library for headless browser automation until recently, but it is no longer maintained, and the release of headless Firefox and Chrome suggests that it is unlikely to come back to life.
- Running headless instances of Chrome/Chromium seems to be the "best" option both in terms of out-of-the-box features and long-term support.
- Headless Firefox seems adequate, but inferior to working with headless Chrome, unless you specifically need to use Firefox.
PhantomJS
PhantomJS has been around since at least 2011 and is, basically, the first popularized headless, scriptable web browser. While it is no longer being maintained (as of 2018), it is still a solid package with good documentation.
PhantomJS Example: Making "Not-Headless" Requests to Google Search Pages
Setup and Installation
- Install PhantomJS. Running sudo apt-get install phantomjs to install PhantomJS worked fine on Ubuntu Linux. If you're using a different operating system, the installation may vary. See the PhantomJS installation page for more information.
- Fuck around. If you want to familiarize yourself with PhantomJS further with basic examples, look at the quickstart guide, which is extremely straightforward.
Create the Script
Create a new file called reacharound.js and paste the following code into it:
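// reacharound.js: print a page's HTML after PhantomJS has loaded it.
var page = require('webpage').create(),
    system = require('system'),
    address;

// system.args[0] is the script name, so a length of 1 means no URL was given.
if (system.args.length === 1) {
    console.log('Usage: reacharound.js [some URL]');
    phantom.exit(1);
}

address = system.args[1];
page.open(address, function(status) {
    if (status !== 'success') {
        console.log('FAIL to load the address');
        phantom.exit(1); // exit on failure too; otherwise PhantomJS keeps running
    } else {
        console.log(page.content); // the page's current HTML once it has loaded
        phantom.exit();
    }
});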
Then, save and exit the file.
Running the Script
To run the script, execute the following command (the URL is quoted so that the shell does not treat the ? as a wildcard):
phantomjs reacharound.js "https://www.google.com/search?q=dog"
This will print out the page's code after loading and executing any JS.
Running the Script as a "Reacharound"
Since PhantomJS can easily be run from the terminal, the output of the reacharound.js script above can also be easily piped or redirected between various Bash/shell scripts. For example, this command:
phantomjs reacharound.js "https://www.google.com/search?q=dog" > reacharound-js-output.html
Will load the same page, but save the result into a file. This could be incorporated into other scripts, such as a search engine result monitoring script or local directory scraper.
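As a rough idea of what "incorporated into other scripts" could look like, here is a minimal Node.js sketch that shells out to PhantomJS and hands the rendered HTML back to the caller. It assumes phantomjs is on your PATH and that reacharound.js is in the current directory; the function names phantomArgs and fetchRendered are made up for this example.

```javascript
// Node.js sketch; helper names are hypothetical, not part of PhantomJS.
const { execFile } = require('child_process');

// Build the argument list for `phantomjs reacharound.js <url>`.
function phantomArgs(url, script = 'reacharound.js') {
  return [script, url];
}

// Run PhantomJS and resolve with whatever the script printed to stdout.
// execFile does not go through a shell, so the URL needs no quoting here.
function fetchRendered(url) {
  return new Promise((resolve, reject) => {
    execFile('phantomjs', phantomArgs(url), { maxBuffer: 16 * 1024 * 1024 },
      (err, stdout) => (err ? reject(err) : resolve(stdout)));
  });
}

// Example use:
// fetchRendered('https://www.google.com/search?q=dog')
//   .then((html) => console.log(html.length, 'characters of rendered HTML'));
```

The larger maxBuffer is there because rendered pages can easily exceed the default stdout buffer size.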
Other Notes and Observations
Command-Line Flags
There are various command-line flags listed in this documentation file, which can simplify some use cases without needing to write that logic into the script itself.
User-Agent
The default user-agent is:
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.1.1 Safari/538.1
And can be changed as specified in this settings documentation file.
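For reference, a minimal sketch of changing the user-agent from inside a script. The page.settings.userAgent property is the setting name from the PhantomJS API; the applyUserAgent helper and the placeholder user-agent string are made up for this example. The PhantomJS-only part is guarded so that the file does nothing if it is loaded somewhere else.

```javascript
// Set the user-agent on a PhantomJS page object before any request is made.
// applyUserAgent is a hypothetical helper, not a PhantomJS function.
function applyUserAgent(page, userAgent) {
  if (!page.settings) {
    page.settings = {};
  }
  page.settings.userAgent = userAgent;
  return page;
}

// Only runs under PhantomJS (the `phantom` global does not exist elsewhere).
if (typeof phantom !== 'undefined') {
  var system = require('system');
  var page = applyUserAgent(
    require('webpage').create(),
    'Mozilla/5.0 (X11; Linux x86_64) ExampleAgent/1.0' // placeholder string
  );
  page.open(system.args[1], function (status) {
    console.log(status === 'success' ? page.content : 'FAIL to load the address');
    phantom.exit(status === 'success' ? 0 : 1);
  });
}
```

The setting has to be applied before page.open is called, since it only affects requests made afterwards.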
Detecting PhantomJS
There is a quality article on how to detect, at least in theory, whether requests are from PhantomJS, as well as an associated presentation.
Whether there are any sites that actually use these methods is unclear.
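To give a sense of what that kind of detection might look like, here is a heuristic sketch. It is not taken from the article mentioned above; it assumes the checks that are commonly cited for PhantomJS: the default user-agent containing "PhantomJS" (see above), plus the window.callPhantom and window._phantom properties that PhantomJS is generally reported to expose to pages. The function name is made up.

```javascript
// Heuristic sketch only: returns true when a window-like object shows
// signs commonly attributed to PhantomJS. All three checks are assumptions
// and are easy for a determined client to hide.
function looksLikePhantomJS(win) {
  var ua = (win.navigator && win.navigator.userAgent) || '';
  return /PhantomJS/.test(ua) ||                // default UA advertises PhantomJS
    typeof win.callPhantom === 'function' ||    // callback bridge exposed to pages
    typeof win._phantom !== 'undefined';        // internal object exposed on window
}

// Example use in a page: if (looksLikePhantomJS(window)) { /* flag the visit */ }
```

Changing the user-agent defeats the first check, which is part of why the value of such checks in practice is unclear.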
Headless Chrome/Chromium
With PhantomJS no longer being maintained, headless Chrome, which was initially released in 2017, is one of the most common solutions in Current Year.
Installation and Setup
- Install Chrome and/or Chromium. Running sudo apt-get install chromium-browser or sudo apt-get install google-chrome should work on Ubuntu Linux, although the Chrome package may first require adding Google's package repository.
- [Troubleshooting]. Open Chromium/Chrome at least once before attempting to launch it headless. When writing this article, I ran into an error with Chromium right after it was first installed on this machine, which then fixed itself after launching Chromium once normally to trigger the first-launch "setup" page.
- More documentation. See the "getting started" guide from Google linked above.
Basic Headless Requests
Get Page Source
Here's an extremely basic example that loads a page, runs the JS, and then saves the resulting source code to an HTML file. The "getting started" guide linked under "installation and setup" suggests that this command should be "google --headless..." instead of "google-chrome --headless..." or "chromium-browser --headless..."; it may depend on your system.
google-chrome --headless --dump-dom "https://www.google.com/search?q=dog" > google-search-headless.html
Or:
chromium-browser --headless --dump-dom "https://www.google.com/search?q=dog" > google-search-headless.html
Get page as a PDF
google-chrome --headless --print-to-pdf "https://www.google.com/search?q=dog"
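If you would rather call these commands from a script than from the shell, a small Node.js wrapper along these lines could work. This is a sketch under a few assumptions: the binary is called google-chrome on your system (swap in chromium-browser if not), and the --print-to-pdf=<file> form for choosing the output file behaves the way I expect (as far as I know, the bare flag writes output.pdf to the working directory). The function names are made up.

```javascript
// Node.js sketch; helper names are hypothetical.
const childProcess = require('child_process');

// Build arguments for a one-shot headless Chrome run.
// mode 'pdf' -> --print-to-pdf (optionally with a file path);
// anything else -> --dump-dom, which prints the DOM to stdout.
function chromeHeadlessArgs(url, mode, pdfPath) {
  const args = ['--headless'];
  if (mode === 'pdf') {
    args.push(pdfPath ? `--print-to-pdf=${pdfPath}` : '--print-to-pdf');
  } else {
    args.push('--dump-dom');
  }
  args.push(url);
  return args;
}

// Run the browser and resolve with its stdout (the HTML in --dump-dom mode).
function runHeadlessChrome(url, mode, pdfPath, binary) {
  return new Promise((resolve, reject) => {
    childProcess.execFile(binary || 'google-chrome',
      chromeHeadlessArgs(url, mode, pdfPath),
      { maxBuffer: 16 * 1024 * 1024 },
      (err, stdout) => (err ? reject(err) : resolve(stdout)));
  });
}

// Example use:
// runHeadlessChrome('https://www.google.com/search?q=dog', 'dom')
//   .then((html) => console.log(html));
```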
Other Notes and Observations
Puppeteer
For more advanced scripting and automation, note that there is a Node.js library called Puppeteer that can be used to control headless Chrome programmatically. Headless Chrome can also/alternatively be combined with tools like Selenium WebDriver.
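A minimal Puppeteer sketch of the same "load the page, run the JS, grab the HTML" task might look like the following. It assumes Puppeteer has been installed with npm install puppeteer. The fetchWithPuppeteer name is made up, and the puppeteer module is passed in as an argument purely so that the function itself has no hard dependency; the 'networkidle2' wait condition is one reasonable choice, not the only one.

```javascript
// Sketch: load a URL in headless Chrome via Puppeteer and return the HTML.
// fetchWithPuppeteer is a hypothetical helper name.
async function fetchWithPuppeteer(puppeteer, url) {
  const browser = await puppeteer.launch();      // headless by default
  try {
    const page = await browser.newPage();
    // Wait until network activity has mostly settled before reading the DOM.
    await page.goto(url, { waitUntil: 'networkidle2' });
    return await page.content();                 // serialized HTML of the page
  } finally {
    await browser.close();                       // always shut the browser down
  }
}

// Example use:
// fetchWithPuppeteer(require('puppeteer'), 'https://www.google.com/search?q=dog')
//   .then((html) => console.log(html));
```

The try/finally is there so that a failed page load does not leave a stray browser process running.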
Headless Firefox
Headless Firefox is basically the same as headless Chrome, but for Firefox. It was also released in 2017.
Installation and Setup
Basic Headless Requests
firefox -headless --screenshot "https://duckduckgo.com/?q=rare+clowns"
Also see the list of command line options for other flags that can be used.
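The screenshot command above can be wrapped from Node.js in the same way as any other CLI call. This sketch assumes that firefox is on your PATH and that --screenshot accepts an optional output file name before the URL, which is how I understand Mozilla's headless documentation; without it, Firefox picks its own default file name. The function names are made up.

```javascript
// Node.js sketch; helper names are hypothetical.
const { execFile: execFirefoxFile } = require('child_process');

// Build arguments for `firefox --headless --screenshot [file] <url>`.
function firefoxScreenshotArgs(url, outFile) {
  const args = ['--headless', '--screenshot'];
  if (outFile) {
    args.push(outFile); // optional output path (assumed CLI behavior)
  }
  args.push(url);
  return args;
}

// Run Firefox and resolve once it has exited.
function takeFirefoxScreenshot(url, outFile) {
  return new Promise((resolve, reject) => {
    execFirefoxFile('firefox', firefoxScreenshotArgs(url, outFile),
      (err) => (err ? reject(err) : resolve()));
  });
}

// Example use:
// takeFirefoxScreenshot('https://duckduckgo.com/?q=rare+clowns', 'clowns.png');
```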
Other Observations
There are no other observations about headless Firefox. It works correctly. Its CLI has fewer "out of the box" tricks for common tasks than headless Chrome does. If you're using it for more complex testing or automation, you're probably combining it with something like Selenium.
Other Shit
- SlimerJS
- CasperJS
- Selenium WebDriver, which was used in this reCaptcha solver.
In Conclusion
Hell yeah. It's time to use some or all of these various libraries for purposes that are relevant to your use case.








Posted by August R. Garcia at 08 July, 2019 23:03 PDT
Of note is that some third parties (a few, but not many!) can detect that there is a headless browser at play here:
https://antoinevastel.com/bot%20detection/2018/01/17/detect-chrome-headless-v2.html
http://geocar.sdf1.org/browser-verification.html
There are countermeasures that can help with varying levels of success.
Posted by jimdigriz at 09 July, 2019 08:23 AM PDT