puppeteer/docs/index.md

262 lines
9.7 KiB
Markdown
Raw Normal View History

2022-07-01 11:52:39 +00:00
# Puppeteer
[![build](https://github.com/puppeteer/puppeteer/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/puppeteer/puppeteer/actions/workflows/ci.yml)
[![npm puppeteer package](https://img.shields.io/npm/v/puppeteer.svg)](https://npmjs.org/package/puppeteer)
2022-07-01 11:52:39 +00:00
<img src="https://user-images.githubusercontent.com/10379601/29446482-04f7036a-841f-11e7-9872-91d1fc2ea683.png" height="200" align="right"/>
#### [Guides](https://pptr.dev/category/guides) | [API](https://pptr.dev/api) | [FAQ](https://pptr.dev/faq) | [Contributing](https://pptr.dev/contributing) | [Troubleshooting](https://pptr.dev/troubleshooting)
2022-07-01 11:52:39 +00:00
> Puppeteer is a Node.js library which provides a high-level API to control
> Chrome/Chromium over the
> [DevTools Protocol](https://chromedevtools.github.io/devtools-protocol/).
> Puppeteer runs in
> [headless](https://developer.chrome.com/docs/chromium/new-headless/)
> mode by default, but can be configured to run in full ("headful")
> Chrome/Chromium.
2022-07-01 11:52:39 +00:00
#### What can I do?
2022-07-01 11:52:39 +00:00
Most things that you can do manually in the browser can be done using Puppeteer!
Here are a few examples to get you started:
2022-07-01 11:52:39 +00:00
- Generate screenshots and PDFs of pages.
- Crawl a SPA (Single-Page Application) and generate pre-rendered content (i.e.
"SSR" (Server-Side Rendering)).
2022-07-01 11:52:39 +00:00
- Automate form submission, UI testing, keyboard input, etc.
- Create an automated testing environment using the latest JavaScript and
browser features.
- Capture a
[timeline trace](https://developer.chrome.com/docs/devtools/performance/reference)
of your site to help diagnose performance issues.
- [Test Chrome Extensions](https://pptr.dev/guides/chrome-extensions).
2022-07-01 11:52:39 +00:00
## Getting Started
### Installation
To use Puppeteer in your project, run:
```bash
npm i puppeteer
# or using yarn
yarn add puppeteer
# or using pnpm
pnpm i puppeteer
2022-07-01 11:52:39 +00:00
```
When you install Puppeteer, it automatically downloads a recent version of
[Chrome for Testing](https://developer.chrome.com/blog/chrome-for-testing/) (~170MB macOS, ~282MB Linux, ~280MB Windows) and a `chrome-headless-shell` binary (starting with Puppeteer v21.6.0) that is [guaranteed to
work](https://pptr.dev/faq#q-why-doesnt-puppeteer-vxxx-work-with-chromium-vyyy)
with Puppeteer. The browser is downloaded to the `$HOME/.cache/puppeteer` folder
by default (starting with Puppeteer v19.0.0). See [configuration](https://pptr.dev/api/puppeteer.configuration) for configuration options and environmental variables to control the download behavor.
If you deploy a project using Puppeteer to a hosting provider, such as Render or
Heroku, you might need to reconfigure the location of the cache to be within
your project folder (see an example below) because not all hosting providers
include `$HOME/.cache` into the project's deployment.
For a version of Puppeteer without the browser installation, see
[`puppeteer-core`](#puppeteer-core).
2022-07-01 11:52:39 +00:00
If used with TypeScript, the minimum supported TypeScript version is `4.7.4`.
#### Configuration
Puppeteer uses several defaults that can be customized through configuration
files.
For example, to change the default cache directory Puppeteer uses to install
browsers, you can add a `.puppeteerrc.cjs` (or `puppeteer.config.cjs`) at the
root of your application with the contents
```js
const {join} = require('path');
/**
* @type {import("puppeteer").Configuration}
*/
module.exports = {
// Changes the cache location for Puppeteer.
cacheDirectory: join(__dirname, '.cache', 'puppeteer'),
};
```
After adding the configuration file, you will need to remove and reinstall
`puppeteer` for it to take effect.
See the [configuration guide](https://pptr.dev/guides/configuration) for more
information.
2022-07-01 11:52:39 +00:00
#### `puppeteer-core`
2022-07-01 11:52:39 +00:00
2023-10-24 07:23:50 +00:00
For every release since v1.7.0 we publish two packages:
2022-07-01 11:52:39 +00:00
- [`puppeteer`](https://www.npmjs.com/package/puppeteer)
- [`puppeteer-core`](https://www.npmjs.com/package/puppeteer-core)
`puppeteer` is a _product_ for browser automation. When installed, it downloads
a version of Chrome, which it then drives using `puppeteer-core`. Being an
end-user product, `puppeteer` automates several workflows using reasonable
defaults [that can be customized](https://pptr.dev/guides/configuration).
2022-07-01 11:52:39 +00:00
`puppeteer-core` is a _library_ to help drive anything that supports DevTools
protocol. Being a library, `puppeteer-core` is fully driven through its
programmatic interface implying no defaults are assumed and `puppeteer-core`
will not download Chrome when installed.
You should use `puppeteer-core` if you are
[connecting to a remote browser](https://pptr.dev/api/puppeteer.puppeteer.connect)
or [managing browsers yourself](https://pptr.dev/browsers-api/).
If you are managing browsers yourself, you will need to call
[`puppeteer.launch`](https://pptr.dev/api/puppeteer.puppeteernode.launch) with
2023-10-24 07:23:50 +00:00
an explicit
2023-03-29 09:37:35 +00:00
[`executablePath`](https://pptr.dev/api/puppeteer.launchoptions)
(or [`channel`](https://pptr.dev/api/puppeteer.launchoptions) if it's
installed in a standard location).
2022-07-01 11:52:39 +00:00
When using `puppeteer-core`, remember to change the import:
2022-07-01 11:52:39 +00:00
```ts
import puppeteer from 'puppeteer-core';
2022-07-01 11:52:39 +00:00
```
### Usage
Puppeteer follows the latest
[maintenance LTS](https://github.com/nodejs/Release#release-schedule) version of
Node.
2022-07-01 11:52:39 +00:00
Puppeteer will be familiar to people using other browser testing frameworks. You
[launch](https://pptr.dev/api/puppeteer.puppeteernode.launch)/[connect](https://pptr.dev/api/puppeteer.puppeteernode.connect)
a [browser](https://pptr.dev/api/puppeteer.browser),
[create](https://pptr.dev/api/puppeteer.browser.newpage) some
[pages](https://pptr.dev/api/puppeteer.page), and then manipulate them with
[Puppeteer's API](https://pptr.dev/api).
2022-07-01 11:52:39 +00:00
For more in-depth usage, check our [guides](https://pptr.dev/category/guides)
and [examples](https://github.com/puppeteer/puppeteer/tree/main/examples).
2022-07-01 11:52:39 +00:00
#### Example
2022-07-01 11:52:39 +00:00
The following example searches [developer.chrome.com](https://developer.chrome.com/) for blog posts with text "automate beyond recorder", click on the first result and print the full title of the blog post.
2022-07-01 11:52:39 +00:00
```ts
import puppeteer from 'puppeteer';
2022-07-01 11:52:39 +00:00
(async () => {
// Launch the browser and open a new blank page
2022-07-01 11:52:39 +00:00
const browser = await puppeteer.launch();
const page = await browser.newPage();
// Navigate the page to a URL
await page.goto('https://developer.chrome.com/');
2022-07-01 11:52:39 +00:00
// Set screen size
await page.setViewport({width: 1080, height: 1024});
2022-07-01 11:52:39 +00:00
// Type into search box
await page.type('.devsite-search-field', 'automate beyond recorder');
// Wait and click on first result
const searchResultSelector = '.devsite-result-item-link';
await page.waitForSelector(searchResultSelector);
await page.click(searchResultSelector);
// Locate the full title with a unique string
const textSelector = await page.waitForSelector(
'text/Customize and automate'
);
const fullTitle = await textSelector?.evaluate(el => el.textContent);
2022-07-01 11:52:39 +00:00
// Print the full title
console.log('The title of this blog post is "%s".', fullTitle);
2022-07-01 11:52:39 +00:00
await browser.close();
})();
```
### Default runtime settings
2022-07-01 11:52:39 +00:00
**1. Uses Headless mode**
By default Puppeteer launches Chrome in
[the Headless mode](https://developer.chrome.com/docs/chromium/new-headless/).
```ts
const browser = await puppeteer.launch();
// Equivalent to
const browser = await puppeteer.launch({headless: true});
```
Before v22, Puppeteer launched the [old Headless mode](https://developer.chrome.com/docs/chromium/new-headless/) by default.
The old headless mode is now known as
[`chrome-headless-shell`](https://developer.chrome.com/blog/chrome-headless-shell)
and ships as a separate binary. `chrome-headless-shell` does not match the
behavior of the regular Chrome completely but it is currently more performant
for automation tasks where the complete Chrome feature set is not needed. If the performance
is more important for your use case, switch to `chrome-headless-shell` as following:
```ts
const browser = await puppeteer.launch({headless: 'shell'});
```
To launch a "headful" version of Chrome, set the
[`headless`](https://pptr.dev/api/puppeteer.browserlaunchargumentoptions) to `false`
option when launching a browser:
2022-07-01 11:52:39 +00:00
```ts
const browser = await puppeteer.launch({headless: false});
2022-07-01 11:52:39 +00:00
```
**2. Runs a bundled version of Chrome**
2022-07-01 11:52:39 +00:00
By default, Puppeteer downloads and uses a specific version of Chrome so its
API is guaranteed to work out of the box. To use Puppeteer with a different
version of Chrome or Chromium, pass in the executable's path when creating a
`Browser` instance:
2022-07-01 11:52:39 +00:00
```ts
const browser = await puppeteer.launch({executablePath: '/path/to/Chrome'});
```
You can also use Puppeteer with Firefox. See
[status of cross-browser support](https://pptr.dev/faq#q-what-is-the-status-of-cross-browser-support) for
more information.
2022-07-01 11:52:39 +00:00
See
[`this article`](https://www.howtogeek.com/202825/what%E2%80%99s-the-difference-between-chromium-and-chrome/)
for a description of the differences between Chromium and Chrome.
[`This article`](https://chromium.googlesource.com/chromium/src/+/refs/heads/main/docs/chromium_browser_vs_google_chrome.md)
describes some differences for Linux users.
2022-07-01 11:52:39 +00:00
**3. Creates a fresh user profile**
Puppeteer creates its own browser user profile which it **cleans up on every
run**.
2022-07-01 11:52:39 +00:00
#### Using Docker
See our [Docker guide](https://pptr.dev/guides/docker).
#### Using Chrome Extensions
See our [Chrome extensions guide](https://pptr.dev/guides/chrome-extensions).
2022-07-01 11:52:39 +00:00
## Resources
- [API Documentation](https://pptr.dev/api)
- [Guides](https://pptr.dev/category/guides)
2022-07-01 11:52:39 +00:00
- [Examples](https://github.com/puppeteer/puppeteer/tree/main/examples)
- [Community list of Puppeteer resources](https://github.com/transitive-bullshit/awesome-puppeteer)
## Contributing
Check out our [contributing guide](https://pptr.dev/contributing) to get an
overview of Puppeteer development.
2022-07-01 11:52:39 +00:00
## FAQ
Our [FAQ](https://pptr.dev/faq) has migrated to
[our site](https://pptr.dev/faq).