# Puppeteer
<!-- [START badges] -->
[![Linux Build Status](https://img.shields.io/travis/com/GoogleChrome/puppeteer/master.svg)](https://travis-ci.com/GoogleChrome/puppeteer) [![Windows Build Status](https://img.shields.io/appveyor/ci/aslushnikov/puppeteer/master.svg?logo=appveyor)](https://ci.appveyor.com/project/aslushnikov/puppeteer/branch/master) [![Build Status](https://api.cirrus-ci.com/github/GoogleChrome/puppeteer.svg)](https://cirrus-ci.com/github/GoogleChrome/puppeteer) [![NPM puppeteer package](https://img.shields.io/npm/v/puppeteer.svg)](https://npmjs.org/package/puppeteer)
<!-- [END badges] -->
<img src="https://user-images.githubusercontent.com/10379601/29446482-04f7036a-841f-11e7-9872-91d1fc2ea683.png" height="200" align="right">

###### [API](https://github.com/GoogleChrome/puppeteer/blob/v1.15.0/docs/api.md) | [FAQ](#faq) | [Contributing](https://github.com/GoogleChrome/puppeteer/blob/master/CONTRIBUTING.md) | [Troubleshooting](https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md)

> Puppeteer is a Node library which provides a high-level API to control Chrome or Chromium over the [DevTools Protocol](https://chromedevtools.github.io/devtools-protocol/). Puppeteer runs [headless](https://developers.google.com/web/updates/2017/04/headless-chrome) by default, but can be configured to run full (non-headless) Chrome or Chromium.
<!-- [START usecases] -->
###### What can I do?
Most things that you can do manually in the browser can be done using Puppeteer! Here are a few examples to get you started:
* Generate screenshots and PDFs of pages.
* Crawl a SPA (Single-Page Application) and generate pre-rendered content (i.e. "SSR" (Server-Side Rendering)).
* Automate form submission, UI testing, keyboard input, etc.
* Create an up-to-date, automated testing environment. Run your tests directly in the latest version of Chrome using the latest JavaScript and browser features.
* Capture a [timeline trace](https://developers.google.com/web/tools/chrome-devtools/evaluate-performance/reference) of your site to help diagnose performance issues.
* Test Chrome Extensions.
<!-- [END usecases] -->
Give it a spin: https://try-puppeteer.appspot.com/
<!-- [START getstarted] -->
## Getting Started
### Installation
To use Puppeteer in your project, run:
```bash
npm i puppeteer
# or "yarn add puppeteer"
```
Note: When you install Puppeteer, it downloads a recent version of Chromium (~170MB Mac, ~282MB Linux, ~280MB Win) that is guaranteed to work with the API. To skip the download, see [Environment variables](https://github.com/GoogleChrome/puppeteer/blob/v1.15.0/docs/api.md#environment-variables).
### puppeteer-core
Since version 1.7.0 we publish the [`puppeteer-core`](https://www.npmjs.com/package/puppeteer-core) package,
a version of Puppeteer that doesn't download Chromium by default.
```bash
npm i puppeteer-core
# or "yarn add puppeteer-core"
```
`puppeteer-core` is intended to be a lightweight version of Puppeteer for launching an existing browser installation or for connecting to a remote one. Be sure that the version of puppeteer-core you install is compatible with the
browser you intend to connect to.
See [puppeteer vs puppeteer-core](https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#puppeteer-vs-puppeteer-core).
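
The two use cases for `puppeteer-core` (connecting to a remote browser, or launching an existing installation) could be sketched roughly as follows. This is an illustration only: the helper name `openBrowser`, the endpoint and the executable path are placeholders that do not come from Puppeteer, and the module is passed in as an argument so the sketch does not depend on which package is installed.

```js
// Hypothetical helper (not part of Puppeteer): connect to a remote browser
// when an endpoint is given, otherwise launch a locally installed one.
// `puppeteer` is the module returned by require('puppeteer-core').
async function openBrowser(puppeteer, {browserWSEndpoint, executablePath} = {}) {
  if (browserWSEndpoint) {
    // puppeteer.connect() attaches to a browser that is already running.
    return puppeteer.connect({browserWSEndpoint});
  }
  if (!executablePath) {
    // puppeteer-core does not download Chromium, so a path has to be supplied.
    throw new Error('Provide either browserWSEndpoint or executablePath');
  }
  return puppeteer.launch({executablePath});
}

// Example usage (placeholder path):
//   const puppeteer = require('puppeteer-core');
//   const browser = await openBrowser(puppeteer, {executablePath: '/path/to/Chrome'});
```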
### Usage
Note: Puppeteer requires at least Node v6.4.0, but the examples below use async/await which is only supported in Node v7.6.0 or greater.
Puppeteer will be familiar to people using other browser testing frameworks. You create an instance
of `Browser`, open pages, and then manipulate them with [Puppeteer's API](https://github.com/GoogleChrome/puppeteer/blob/v1.15.0/docs/api.md#).
**Example** - navigating to https://example.com and saving a screenshot as *example.png*:
Save file as **example.js**
```js
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.screenshot({path: 'example.png'});

  await browser.close();
})();
```
Execute script on the command line
```bash
node example.js
```
Puppeteer sets the initial page size to 800px x 600px, which defines the screenshot size. The page size can be customized with [`Page.setViewport()`](https://github.com/GoogleChrome/puppeteer/blob/v1.15.0/docs/api.md#pagesetviewportviewport).
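
As a minimal sketch of changing the page size before a screenshot, the helper below calls `page.setViewport()` and then `page.screenshot()`. The helper name `screenshotAt` and the dimensions in the usage comment are illustrative assumptions, not part of Puppeteer.

```js
// Hypothetical helper (not part of Puppeteer): resize the page, then capture it.
// `page` is a Puppeteer Page; width and height are in CSS pixels.
async function screenshotAt(page, width, height, path) {
  // The viewport determines the size of the resulting screenshot.
  await page.setViewport({width, height});
  await page.screenshot({path});
}

// Example usage inside an async function, after page.goto():
//   await screenshotAt(page, 1280, 720, 'example-wide.png');
```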
**Example** - create a PDF.
Save file as **hn.js**
```js
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://news.ycombinator.com', {waitUntil: 'networkidle2'});
  await page.pdf({path: 'hn.pdf', format: 'A4'});

  await browser.close();
})();
```
Execute script on the command line
```bash
node hn.js
```
See [`Page.pdf()`](https://github.com/GoogleChrome/puppeteer/blob/v1.15.0/docs/api.md#pagepdfoptions) for more information about creating PDFs.
**Example** - evaluate script in the context of the page
Save file as **get-dimensions.js**
```js
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');

  // Get the "viewport" of the page, as reported by the page.
  const dimensions = await page.evaluate(() => {
    return {
      width: document.documentElement.clientWidth,
      height: document.documentElement.clientHeight,
      deviceScaleFactor: window.devicePixelRatio
    };
  });

  console.log('Dimensions:', dimensions);

  await browser.close();
})();
```
Execute script on the command line
```bash
node get-dimensions.js
```
See [`Page.evaluate()`](https://github.com/GoogleChrome/puppeteer/blob/v1.15.0/docs/api.md#pageevaluatepagefunction-args) for more information on `evaluate` and related methods like `evaluateOnNewDocument` and `exposeFunction`.
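
The related methods mentioned above can be combined; the following is a rough sketch rather than a recommended pattern. The helper name `instrumentPage` and the exposed function name `reportTitle` are made up for illustration. Only `page.exposeFunction()` and `page.evaluateOnNewDocument()` are Puppeteer APIs.

```js
// Hypothetical helper (not part of Puppeteer): prepare a page before navigating.
async function instrumentPage(page) {
  // exposeFunction() adds window.reportTitle to the page. Calling it from the
  // page runs this callback in Node and resolves with its return value.
  await page.exposeFunction('reportTitle', title => {
    console.log('Page title:', title);
    return title.length;
  });

  // evaluateOnNewDocument() runs this function in every new document,
  // before any of the page's own scripts execute.
  await page.evaluateOnNewDocument(() => {
    window.addEventListener('DOMContentLoaded', () => {
      window.reportTitle(document.title);
    });
  });
}

// Example usage inside an async function:
//   await instrumentPage(page);
//   await page.goto('https://example.com');
```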
<!-- [END getstarted] -->
<!-- [START runtimesettings] -->
## Default runtime settings
**1. Uses Headless mode**
Puppeteer launches Chromium in [headless mode](https://developers.google.com/web/updates/2017/04/headless-chrome). To launch a full version of Chromium, set the ['headless' option](https://github.com/GoogleChrome/puppeteer/blob/v1.15.0/docs/api.md#puppeteerlaunchoptions) when launching a browser:
```js
const browser = await puppeteer.launch({headless: false}); // default is true
```
**2. Runs a bundled version of Chromium**
By default, Puppeteer downloads and uses a specific version of Chromium so its API
is guaranteed to work out of the box. To use Puppeteer with a different version of Chrome or Chromium,
pass in the executable's path when creating a `Browser` instance:
```js
const browser = await puppeteer.launch({executablePath: '/path/to/Chrome'});
```
See [`Puppeteer.launch()`](https://github.com/GoogleChrome/puppeteer/blob/v1.15.0/docs/api.md#puppeteerlaunchoptions) for more information.
See [`this article`](https://www.howtogeek.com/202825/what%E2%80%99s-the-difference-between-chromium-and-chrome/) for a description of the differences between Chromium and Chrome. [`This article`](https://chromium.googlesource.com/chromium/src/+/master/docs/chromium_browser_vs_google_chrome.md) describes some differences for Linux users.
**3. Creates a fresh user profile**
Puppeteer creates its own Chromium user profile which it **cleans up on every run**.
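
If a profile should persist between runs instead, `puppeteer.launch()` accepts a `userDataDir` option. The sketch below assumes a caller-chosen directory; the helper name `launchWithProfile` and the directory in the usage comment are hypothetical.

```js
const path = require('path');

// Hypothetical helper (not part of Puppeteer): launch with a persistent
// user profile stored in the given directory.
async function launchWithProfile(puppeteer, profileDir) {
  // Resolve to an absolute path so the location does not depend on the
  // current working directory of the script.
  return puppeteer.launch({userDataDir: path.resolve(profileDir)});
}

// Example usage inside an async function:
//   const puppeteer = require('puppeteer');
//   const browser = await launchWithProfile(puppeteer, './my-profile');
```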
<!-- [END runtimesettings] -->
## Resources
- [API Documentation](https://github.com/GoogleChrome/puppeteer/blob/v1.15.0/docs/api.md)
- [Examples](https://github.com/GoogleChrome/puppeteer/tree/master/examples/)
- [Community list of Puppeteer resources](https://github.com/transitive-bullshit/awesome-puppeteer)
<!-- [START debugging] -->
## Debugging tips
1. Turn off headless mode - sometimes it's useful to see what the browser is
   displaying. Instead of launching in headless mode, launch a full version of
   the browser using `headless: false`:

        const browser = await puppeteer.launch({headless: false});

2. Slow it down - the `slowMo` option slows down Puppeteer operations by the
   specified amount of milliseconds. It's another way to help see what's going on.

        const browser = await puppeteer.launch({
          headless: false,
          slowMo: 250 // slow down by 250ms
        });

3. Capture console output - You can listen for the `console` event.
   This is also handy when debugging code in `page.evaluate()`:

        page.on('console', msg => console.log('PAGE LOG:', msg.text()));

        await page.evaluate(() => console.log(`url is ${location.href}`));

4. Stop test execution and use a debugger in browser

   - Use `{devtools: true}` when launching Puppeteer:

     `const browser = await puppeteer.launch({devtools: true});`

   - Change default test timeout:

     jest: `jest.setTimeout(100000);`

     jasmine: `jasmine.DEFAULT_TIMEOUT_INTERVAL = 100000;`

     mocha: `this.timeout(100000);` (don't forget to change test to use [function and not '=>'](https://stackoverflow.com/a/23492442))

   - Add an evaluate statement with `debugger` inside / add `debugger` to an existing evaluate statement:

     `await page.evaluate(() => {debugger;});`

     The test will now stop executing in the above evaluate statement, and chromium will stop in debug mode.

5. Enable verbose logging - internal DevTools protocol traffic
   will be logged via the [`debug`](https://github.com/visionmedia/debug) module under the `puppeteer` namespace.

        # Basic verbose logging
        env DEBUG="puppeteer:*" node script.js

        # Protocol traffic can be rather noisy. This example filters out all Network domain messages
        env DEBUG="puppeteer:*" env DEBUG_COLORS=true node script.js 2>&1 | grep -v '"Network'

6. Debug your Puppeteer (node) code easily, using [ndb](https://github.com/GoogleChromeLabs/ndb)

   - `npm install -g ndb` (or even better, use [npx](https://github.com/zkat/npx)!)

   - add a `debugger` to your Puppeteer (node) code

   - add `ndb` (or `npx ndb`) before your test command. For example:

     `ndb jest` or `ndb mocha` (or `npx ndb jest` / `npx ndb mocha`)

   - debug your test inside chromium like a boss!

## Contributing to Puppeteer
Check out the [contributing guide](https://github.com/GoogleChrome/puppeteer/blob/master/CONTRIBUTING.md) to get an overview of Puppeteer development.
<!-- [START faq] -->
# FAQ
#### Q: Who maintains Puppeteer?
The Chrome DevTools team maintains the library, but we'd love your help and expertise on the project!
See [Contributing](https://github.com/GoogleChrome/puppeteer/blob/master/CONTRIBUTING.md).
#### Q: What are Puppeteer's goals and principles?
The goals of the project are:
- Provide a slim, canonical library that highlights the capabilities of the [DevTools Protocol](https://chromedevtools.github.io/devtools-protocol/).
- Provide a reference implementation for similar testing libraries. Eventually, these other frameworks could adopt Puppeteer as their foundational layer.
- Grow the adoption of headless/automated browser testing.
- Help dogfood new DevTools Protocol features...and catch bugs!
- Learn more about the pain points of automated browser testing and help fill those gaps.
We adapt [Chromium principles](https://www.chromium.org/developers/core-principles) to help us drive product decisions:
- **Speed**: Puppeteer has almost zero performance overhead over an automated page.
- **Security**: Puppeteer operates off-process with respect to Chromium, making it safe to automate potentially malicious pages.
- **Stability**: Puppeteer should not be flaky and should not leak memory.
- **Simplicity**: Puppeteer provides a high-level API that's easy to use, understand, and debug.
#### Q: Is Puppeteer replacing Selenium/WebDriver?
**No**. Both projects are valuable for very different reasons:
- Selenium/WebDriver focuses on cross-browser automation; its value proposition is a single standard API that works across all major browsers.
- Puppeteer focuses on Chromium; its value proposition is richer functionality and higher reliability.
That said, you **can** use Puppeteer to run tests against Chromium, e.g. using the community-driven [jest-puppeteer](https://github.com/smooth-code/jest-puppeteer). While this probably shouldn't be your only testing solution, it does have a few good points compared to WebDriver:

- Puppeteer requires zero setup and comes bundled with the Chromium version it works best with, making it [very easy to start with](https://github.com/GoogleChrome/puppeteer/#getting-started). At the end of the day, it's better to have a few tests running chromium-only than no tests at all.
- Puppeteer has an event-driven architecture, which removes a lot of potential flakiness. There's no need for evil “sleep(1000)” calls in Puppeteer scripts.
- Puppeteer runs headless by default, which makes it fast to run. Puppeteer v1.5.0 also exposes browser contexts, making it possible to efficiently parallelize test execution.
- Puppeteer shines when it comes to debugging: flip the “headless” bit to false, add “slowMo”, and you'll see what the browser is doing. You can even open Chrome DevTools to inspect the test environment.
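
To make the point about browser contexts concrete, here is a minimal sketch of running several tasks in parallel, each in its own incognito context so that cookies and cache are not shared. The helper name `runIsolated` and the task signature are assumptions for illustration. `browser.createIncognitoBrowserContext()`, `context.newPage()` and `context.close()` are the Puppeteer calls it relies on.

```js
// Hypothetical helper (not part of Puppeteer): run each task in its own
// incognito browser context and collect the results in order.
// Each task is an async function that receives a fresh Page.
async function runIsolated(browser, tasks) {
  return Promise.all(tasks.map(async task => {
    const context = await browser.createIncognitoBrowserContext();
    try {
      const page = await context.newPage();
      return await task(page);
    } finally {
      // Closing the context also closes every page that belongs to it.
      await context.close();
    }
  }));
}

// Example usage inside an async function:
//   const titles = await runIsolated(browser, [
//     async page => { await page.goto('https://example.com'); return page.title(); },
//     async page => { await page.goto('https://example.org'); return page.title(); },
//   ]);
```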
#### Q: Why doesn't Puppeteer v.XXX work with Chromium v.YYY?
We see Puppeteer as an **indivisible entity** with Chromium. Each version of Puppeteer bundles a specific version of Chromium, **the only** version it is guaranteed to work with.

This is not an artificial constraint: A lot of work on Puppeteer is actually taking place in the Chromium repository. Here's a typical story:

- A Puppeteer bug is reported: https://github.com/GoogleChrome/puppeteer/issues/2709
- It turned out this is an issue with the DevTools protocol, so we're fixing it in Chromium: https://chromium-review.googlesource.com/c/chromium/src/+/1102154
- Once the upstream fix has landed, we roll the updated Chromium into Puppeteer: https://github.com/GoogleChrome/puppeteer/pull/2769

However, oftentimes it is desirable to use Puppeteer with the official Google Chrome rather than Chromium. For this to work, you should install a `puppeteer-core` version that corresponds to the Chrome version.
For example, in order to drive Chrome 71 with puppeteer-core, use the `chrome-71` npm tag:
```bash
npm install puppeteer-core@chrome-71
```
#### Q: Which Chromium version does Puppeteer use?
Look for `chromium_revision` in [package.json](https://github.com/GoogleChrome/puppeteer/blob/master/package.json).
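
At runtime, the version of the browser actually being driven can also be read with `browser.version()`. The sketch below splits the returned string into a name and a version number. The helper name `describeBrowser` is hypothetical, and the exact string format may differ between browser builds.

```js
// Hypothetical helper (not part of Puppeteer): report which browser is in use.
async function describeBrowser(browser) {
  // browser.version() resolves to a string such as "HeadlessChrome/<version>"
  // in headless mode or "Chrome/<version>" otherwise.
  const version = await browser.version();
  const [name, number] = version.split('/');
  return {name, number};
}

// Example usage inside an async function:
//   console.log(await describeBrowser(browser));
```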
#### Q: What's considered a “Navigation”?
From Puppeteer's standpoint, **“navigation” is anything that changes a page's URL**.
Aside from regular navigation where the browser hits the network to fetch a new document from the web server, this includes [anchor navigations](https://www.w3.org/TR/html5/single-page.html#scroll-to-fragid) and [History API](https://developer.mozilla.org/en-US/docs/Web/API/History_API) usage.
With this definition of “navigation,” **Puppeteer works seamlessly with single-page applications.**
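
In practice this means the same waiting code can cover full page loads, anchor navigations and History API changes. The following is a sketch only: the helper name `clickAndWaitForNavigation` and the selector in the usage comment are illustrative, while `page.waitForNavigation()` and `page.click()` are Puppeteer APIs.

```js
// Hypothetical helper (not part of Puppeteer): click an element that triggers
// a navigation and wait until that navigation has finished.
async function clickAndWaitForNavigation(page, selector) {
  // Start waiting before clicking so a fast navigation cannot be missed.
  const [response] = await Promise.all([
    page.waitForNavigation(),
    page.click(selector),
  ]);
  // According to the API docs, the response is null when the navigation was
  // an anchor change or a History API update rather than a document load.
  return response;
}

// Example usage inside an async function:
//   const response = await clickAndWaitForNavigation(page, 'a.next-page');
```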
#### Q: What's the difference between a “trusted” and “untrusted” input event?
In browsers, input events can be divided into two big groups: trusted vs. untrusted.

- **Trusted events**: events generated by users interacting with the page, e.g. using a mouse or keyboard.
- **Untrusted events**: events generated by Web APIs, e.g. `document.createEvent` or `element.click()` methods.

Websites can distinguish between these two groups:

- using an [`Event.isTrusted`](https://developer.mozilla.org/en-US/docs/Web/API/Event/isTrusted) event flag
- sniffing for accompanying events. For example, every trusted `'click'` event is preceded by `'mousedown'` and `'mouseup'` events.

For automation purposes it's important to generate trusted events. **All input events generated with Puppeteer are trusted and fire proper accompanying events.** If, for some reason, one needs an untrusted event, it's always possible to hop into a page context with `page.evaluate` and generate a fake event:
```js
await page.evaluate(() => {
  document.querySelector('button[type=submit]').click();
});
```
#### Q: What features does Puppeteer not support?
You may find that Puppeteer does not behave as expected when controlling pages that incorporate audio and video. (For example, [video playback/screenshots is likely to fail](https://github.com/GoogleChrome/puppeteer/issues/291).) There are two reasons for this:
* Puppeteer is bundled with Chromium--not Chrome--and so by default, it inherits all of [Chromium's media-related limitations](https://www.chromium.org/audio-video). This means that Puppeteer does not support licensed formats such as AAC or H.264. (However, it is possible to force Puppeteer to use a separately-installed version of Chrome instead of Chromium via the [`executablePath` option to `puppeteer.launch`](https://github.com/GoogleChrome/puppeteer/blob/v1.15.0/docs/api.md#puppeteerlaunchoptions). You should only use this configuration if you need an official release of Chrome that supports these media formats.)
* Since Puppeteer (in all configurations) controls a desktop version of Chromium/Chrome, features that are only supported by the mobile version of Chrome are not supported. This means that Puppeteer [does not support HTTP Live Streaming (HLS)](https://caniuse.com/#feat=http-live-streaming).
#### Q: I am having trouble installing / running Puppeteer in my test environment. Where should I look for help?
We have a [troubleshooting](https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md) guide for various operating systems that lists the required dependencies.
#### Q: How do I try/test a prerelease version of Puppeteer?
You can check out this repo or install the latest prerelease from npm:
```bash
npm i --save puppeteer@next
```
Please note that prerelease versions may be unstable and contain bugs.
#### Q: I have more questions! Where do I ask?
There are many ways to get help on Puppeteer:
- [bugtracker](https://github.com/GoogleChrome/puppeteer/issues)
- [stackoverflow](https://stackoverflow.com/questions/tagged/puppeteer)
- [slack channel](https://join.slack.com/t/puppeteer/shared_invite/enQtMzU4MjIyMDA5NTM4LTM1OTdkNDhlM2Y4ZGUzZDdjYjM5ZWZlZGFiZjc4MTkyYTVlYzIzYjU5NDIyNzgyMmFiNDFjN2UzNWU0N2ZhZDc)
Make sure to search these channels before posting your question.
<!-- [END faq] -->