This patch makes it easier to see exactly how to use the registerCustomQueryHandler API without having to follow the trail of breadcrumbs throughout the docs.
prior command installs stable version of chrome (RUN command), but after that, the docs is referring to unstable version.
added correct commands for troubleshooting in linux environment.
By adding support for an environment variable `PUPPETEER_DOWNLOAD_PATH` it is possible to support downloading the browser binaries into a folder outside the `node_modules` folder. This makes it possible to preserve previously downloaded binaries in order to skip downloading them again.
* chore: Use devtools-protocol package
Rather than maintain our own protocol we can instead use the devtools-protocol package and pin it to the version of Chromium that Puppeteer is shipping with.
The only changes are naming changes between the bespoke protocol that Puppeteer created and the devtools-protocol one.
This change started as a small change to pull types from DefinitelyTyped over to
Puppeteer for the `evaluateHandle` function but instead ended up also fixing
what looks to be a long standing issue with our existing documentation.
`evaluateHandle` can in fact return an `ElementHandle` rather than a `JSHandle`.
Note that `ElementHandle` extends `JSHandle` so whilst the docs are technically
correct (all ElementHandles are JSHandles) it's confusing because JSHandles
don't have methods like `click` on them, but ElementHandles do.
if you return something that is an HTML element:
```
const button = page.evaluateHandle(() => document.querySelector('button'));
// this is an ElementHandle, not a JSHandle
```
Therefore I've updated the original docs and added a large explanation to the
TSDoc for `page.evaluateHandle`.
In TypeScript land we'll assume the function will return a `JSHandle` but you
can tell TS otherwise via the generic argument, which can only be `JSHandle`
(the default) or `ElementHandle`:
```
const button = page.evaluateHandle<ElementHandle>(() => document.querySelector('button'));
```
As far as I can tell these became irrelevant as of v1.15 which added
`puppeteer.errors` and `puppeteer.devices [1]. This is a breaking change
but one that's easily mitigated. We've said that we don't consider
changes to our folder/file structure a breaking change, but we can't
really do that if we have these two top level files that we've
documented.
[1]: e3abb0aa32 (diff-522b24108d7446af4c59873472a90444)
Just one was used externally and I wrapped that up in a method. I think
it's a useful method to provide (I can imagine wanting to know if JS is
enabled on a page) so I think there's no harm here (I'd rather that then
have JSHandle reach into a private variable).
Replacing the Node EventEmitter with Mitt caused more problems than
anticipated for end users due to the API differences and the amount of
people who relied on the EventEmitter API. In hindsight this clearly
should have been explored more and then released as a breaking v4.
This commit rolls us back to the built in Node EventEmitter library
which we can release to get everyone back on stable builds. We can then
consider our approach to migrating to Mitt and when we do do that we can
release it as a breaking change and properly document the migration
strategy and approach.
* chore: migrate to Mitt as the EventEmitter
This commit moves us to using Mitt [1] for the event emitter in
Puppeteer. This removes our dependency to Node's EventEmitter which is
part of a larger stream of work to enable a Puppeteer-web version that
doesn't depend on Node.
There are no large breaking changes as we support the main methods that
EventEmitter had, but it also provides some methods that Puppeteer
didn't use. Technically end users could depend on this but it's
unlikely.
[1]: https://github.com/developit/mitt
It conflicts with an inbuilt TypeScript `Request` type so can cause
confusion when in TS land. Note: `Response.ts` and `Worker.ts` also
suffer from this; PRs to rename them are incoming.
Fixes an edge case where Puppeteer looked for a Chromium revision when launching Firefox.
Allow appropriate Launcher to be instantiated when calling `Puppeteer.connect`.
Add an example of running Firefox.
The recommended Dockerfile uses node:10-slim image as a base,
but the base image does not contain wget command anymore.
(About the reason, see https://github.com/nodejs/docker-node/issues/1185)
So fixed the problem.
* (feat) Add option to fetch Firefox Nightly
Add Firefox support to BrowserFetcher and the install script.
By default, the latest Firefox Nightly is downloaded
directly from archive.mozilla.org (dmg, tar.bz2 and zip)
This also required changes that impact `puppeteer.launch()`
and `puppeteer.executablePath()`
Fixes#5151
* Update docs/api.md
Co-Authored-By: Mathias Bynens <mathias@qiwi.be>
* Clean up revision promise
* Improve error handling in revision check
* Remove matchAll
* Use explicit octal mode
* Update .gitignore
Co-authored-by: Mathias Bynens <mathias@qiwi.be>
* chore: update relevant Node.js versions from 8 to 10
* chore: remove node6 and node8 folders from puppeteer-firefox ci
* fix: loosen definition for proc.stdio
* fix: update typescript version used in npm run test-types
This makes it more clear that the callback receives an actual array of nodes instead of just a NodeList.
Co-authored-by: Mathias Bynens <mathias@qiwi.be>
This changes the Chromium revision to r722234 (Chrome 80.0.3987.0),
since that's the most recent version in the Chromium 80 range for
which a download exists for all supported platforms.
* feat: Set which browser to launch via PUPPETEER_PRODUCT
This change introduces a PUPPETEER_PRODUCT environment
variable as a first step toward using Puppeteer with
many different browsers. Setting PUPPETEER_PRODUCT=firefox, for
example, enables Firefox-specific Launcher settings.
The state is also exposed as `puppeteer.product` in the API
to support adding other product-specific behaviour as needed.
The bulk of the change is a refactoring in Launcher
to decouple generic browser start-up from product-specific
configuration.
Respecting the puppeteer-core restriction for PUPPETEER_
environment variables, lazily instantiate the Launcher
based on a `product` Puppeteer.launch option, if available.
* test: Distinguish Juggler unit tests from Firefox
The funit script is renamed to fjunit (j for Juggler, which is
used only by the experimental puppeteer-firefox package.
In contrast, the funit script now refers to running Puppeteer
unit tests against the main puppeteer package with Firefox.
To do so with Firefox Nightly, run:
`BINARY=path/to/firefox npm run funit`
A number of changes in this patch make it easier to run
Puppeteer unit tests in Mozilla's CI.
Adds note about Jest maxWorkers as well as the base image to start with. This is an improvement over the previous section I wrote, from me banging my head against a YAML file all week 🙃.
This came from personal difficulties in running Puppeteer tests on CircleCI. I tried to keep the note as brief as possible, while being helpful for an entire CI platform.
This patch introduces a page.waitForFileChooser() method
that adds a watchdog to wait for file chooser dialogs.
This lets Puppeteer users to capture file chooser requests
and fulfill/cancel them if necessary.
Fixes#2946
A freetype update broke bitmap fonts. Adding freetype-dev to the Alpine dependencies resolves related issues.
This resolves these errors when launching chromium:
```/usr/bin/chromium-browser
Error relocating /usr/lib/chromium/chrome: FT_Get_Color_Glyph_Layer: symbol not found
Error relocating /usr/lib/chromium/chrome: FT_Palette_Select: symbol not found```
Background:
https://bugs.alpinelinux.org/issues/10309https://github.com/stark/siji/issues/28https://github.com/lucy/tewi-font/issues/35
The documentation for frame.goto() and page.goto() were updated to make
it clear that the method will not throw an error if the HTTP requests
results in any valid HTTP status code being returned by the remote
server.
Going from `AXNode` -> `ElementHandle` is turning out to be controversial.
This patch instead adds a way to go from `ElementHandle` -> `AXNode`. If the API looks good, I'll add it into Firefox as well.
References #3641
- Pins Alpine Chromium version to prevent updates from causing issues
When using these instructions today, I found that the Chromium version in the latest Alpine edge was 73, which caused errors with Puppeteer.
Pinning to the latest 72 version in Alpine registry resolved the issue, and should prevent others from running into it in the future.
This roll includes:
- https://crrev.com/653809 - FrameLoader: ignore failing provisional loads entirely
- https://crrev.com/654750 - DevTools: make sure Network.requestWillBeSent is emitted on time for sync xhrs
The FrameLoader patch is the reason behind the test change. It's
actually desirable to fail frame navigation if the frame detaches - and
that's consistent with Firefox.
Fixes#4337
These getters are introduced as a more convenient substitute for
a `require('puppeteer/Errors')` and
`require('puppeteer/DeviceDescriptors')`.
This way we can make cross-browser story nicer - a single require
of `puppeteer` or `puppeteer-firefox` fully defines Puppeteer
environment.
* removing libgconf-2-4 install since no longer needed according to https://bugs.chromium.org/p/chromium/issues/detail?id=795759#c7
* wget is already included in `node:8-slim` image, so removed lines related to install/cleanup
* node 8 has EOL this year, so incremented to node:10-slim
* use "docker run --init" if available (available in docker-engine >= 1.13.0)
* make dumb-init optional
* combine permission changes and 'npm install' of puppeteer into same line to reduce image size by few hundred MB
* overall image size reduction: 1.21GB -> 865MB
A link on line 525 is pointing to a undefined branch (`lkcr`) in `chromium.googlesource.com/chromium/src/`. Change it to point to `lkgr` instead, since it's the closest defined branch in name.
Method `page.setDefaultTimeout` overrides default 30 seconds timeout
for all `page.waitFor*` methods, including navigation and waiting
for selectors.
Fix#3319.
`page.waitForSelector` should return `null` if waiting for `hidden:
true` and there's no matching node in DOM.
Before this patch, `page.waitForSelector` would return some JSHandle
pointing to boolean value.
Makes Running on Alpine up to date:
- Chrome is now available in LTS Node 10
- Chrome version is updated to the latest alpine `@edge`, 71
- Corresponding Puppeteer is updated to v1.9.0
- `harfbuzz` is now required by dynamic linking
The proposal adds a drop-down list in a similar fashion as Dependencies list since it feels a little weird to have list for a continuing detail as I assume that all the three options belong to the same level of information.
Very small change in light of operational experience while getting it running on Centos in Jenkins pipeline.
Without the `-p`, the permissions set in the `chmod` before this command are not carried over chrome cannot start.
This patch teaches `page.setContent` to await resources in
the new document.
**NOTE**: This patch changes behavior: currently, `page.setContent`
awaits the `"domcontentloaded"` event; with this patch, we can now await
other lifecycle events, and switched default to the `"load"` event.
The change is justified since current behavior made `page.setContent`
unusable for its main designated usecases, pushing our client
to use [dataURL workaround](https://github.com/GoogleChrome/puppeteer/issues/728#issuecomment-334301491).
Fixes#728
This adds `page.accessibility.snapshot()`. It serializes and returns the accessibility tree for the page. By default, uninteresting nodes are filtered out of the snapshot.
fixes#2033
This adds `browser.waitForTarget` and `browserContext.waitForTarget`. It also fixes a flaky test that was incorrectly expecting targets to appear instantly.
This patch:
- adds experimental "transport" option to pptr.connect
- uses "transport" option to make sure Puppeteer-Web works with
Target.exposeDevToolsProtocol
Drive-by: add `browser.target()` to access browser target.
This patch introduces API to manage frame navigations.
As a drive-by, the `response.frame()` method is added as a shortcut
for `response.request().frame()`.
Fixes#2918.
We had (and still have) a ton of pull requests to support
PUPPETEER_EXECUTABLE_PATH and PUPPETEER_CHROMIUM_REVISION in puppeteer launcher.
We were hesitant before since env variables are not scoped
and thus don't make a good interface for a library. Now, since we
determined `puppeteer-core` as a library and `puppeteer` as our end-user
product, it's safe to satisfy our user needs.
This patch:
- teaches PUPPETEER_EXECUTABLE_PATH and PUPPETEER_CHROMIUM_REVISION
env variables to control how Puppeteer launches browser
- makes sure these variables play no role in `puppeteer-core` package.
If referer is passed to the options object its value will be used as the referer instead of the value set by `Page.setExtraHTTPHeaders()`.
This is the correct way to set referer header: otherwise, the `referer` header will override all the document subrequests.
Fixes#3090.
Introduce an API to manage permissions per browser context:
- BrowserContext.overridePermissions(origin, permissions)
- BrowserContext.clearPermissionOverrides()
Fixes#846.
It turned out that almost any usecase requires helper methods to access
DOM inside the ExecutionContext.
Instead of exposing execution contexts as-is, we should introduce
IsolatedWorld as a first-class citizen that will hold execution contexts
inside.
This patch adds a new require, `puppeteer/Errors`, that
holds all the Puppeteer-specific error classes.
Currently, the only custom error class we use is `TimeoutError`. We'll
expand in future with `CrashError` and some others.
Fixes#1694.
This patch adds `reportAnonymousScripts` option to the `coverage.startJSCoverage` method. With this option, anonymous scripts are reported as well.
Fixes#2777
Adds guidance for producing accurate colors in PDF output. page.pdf() can produce unexpected document colors unless forced to render exact colors.
Fixes#2685
This patch:
- adds `worker.evaluate` and `worker.evaluateHandle` methods as a shortcut to their execution context equivalents.
- makes the error messages a bit nicer when interacting with a closed worker (as opposed to a closed page).
- moves the worker tests into their own spec file.
This patch drops the markdown-toc module and instead rolls out
our own simple markdown table-of-contents generator.
As a side effect, it fixes links to `page.$` and `page.$$`.
Docs about `page.$$eval` and `frame.$$eval` are not accurate and might be confusing. `document.querySelectorAll` returns `NodeList`, while `frame.$$eval` is actually doing `Array.from(querySelectorAll(selector))`, which actually returns an array.
This makes things this possible:
`await page.$$eval('div', divs => divs.map...)`
This patch fixes docs to mention that $$eval is actually performing:
`Array.from(querySelectorAll(selector))`
Which will let the user understand that the element he receives is an array, and not a NodeList.
This adds `page.workers()`, and two events `workercreated` and `workerdestroyed`. It also forwards logs from a worker into the page `console` event.
Only dedicated workers are supported for now, ServiceWorkers will probably work differently because they aren't necessarily associated with a single page.
Fixes#2350.
This patch introduces Browser Contexts and methods to manage them:
- `browser.createIncognitoBrowserContext()` - to create new incognito
context
- `browser.browserContext()` - to get all existing contexts
- `browserContext.dispose()` - to dispose incognito context.
Fixes#85.
Today, `page.close()` method doesn't run page's beforeunload listeners.
This way users can be sure that `page.close()` actually closes the
page.
This patch adds an optional `runBeforeUnload` option to the
`page.close()` method that would run beforeunload listeners. Note:
running beforeunload handlers might cancel page closing.
Fixes#2386.
Last release v1.3.0 had an error in the documentation, claiming
it wasn't released.
This patch makes sure we have a little bit of automation in place
to save us from this in future.
This patch introduces a new `pipe` option to the launcher to connect over a pipe.
In certain environments, exposing web socket for remote debugging is a security risk.
Pipe connection eliminates this risk.
This patch adds support for `timeout: 0` to disable timeout for the following functions:
- `page.waitForFunction`
- `page.waitForXPath`
- `page.waitForSelector`
and their `frame` counterparts.
Fixes#2200
This patch:
- starts fulfilling security details for redirect responses
- changes `response.securityDetails()` to return null if the response
is served over non-secure connection
This patch introduces ExecutionContext.frame() that returns Frame
associated with this Execution Context.
This allows to associate console messages with the originating frame,
if any.
This patch:
- introduces `SecurityDetails` class that exposes a set of fields that describe properties of secure connection
- introduces method `response.securityDetails()` that returns an instance of `SecurityDetails` object.
This patch introduces `BrowserFetcher` class that manages
downloaded versions of products.
This patch:
- shapes Downloader API to be minimal yet usable for our needs. This
includes removing such methods as `Downloader.supportedPlatforms` and
`Downloader.defaultRevision`.
- makes most of the fs-related methods in Downloader async. The only
exception is the `Downloader.revisionInfo`: it has stay sync due to the
`pptr.executablePath()` method being sync.
- updates `install.js` and `utils/check_availability.js` to use new API
- finally, renames `Downloader` into `BrowserFetcher`
Fixes#1748.
This patch:
- introduces `test/assets/cached` folder and teaches server to cache
all the assets from the folder
- introduces `test/assets/serviceworkers` folder that stores all the
service workers and makes them register with unique URL prefix
- introduces `Response.fromCache()` and `Response.fromServiceWorker()`
methods
Fixes#1551.
This patch:
- migrates CI to use NPM
- drops lockfiles (`yarn.lock`). Lockfiles are ignored by package
managers when the package is installed as a dependency, so this makes CI closer to the
installation our clients run.
This patch introduces a `slowMo` option to the `puppeteer.connect` method. The option
is similar to the one in `puppeteer.launch` and is used to slow down the connection.
This patch:
- introduces `page.waitForXPath` method
- introduces `frame.waitForXPath` method
- amends `page.waitFor` to treat strings that start with `//` as xpath queries.
Fixes#1757.
feat: expose raw devtools protocol connection
This patch introduces `target.createCDPSession` method that
allows directly communicating with the target over the
Chrome DevTools Protocol.
Fixes#31.
This patch:
- teaches page.waitFor* methods to accept JSHandles
- starts returning JSHandles from page.waitFor* calls.
BREAKING CHANGE: this patch starts allocating `JSHandle`/`ElementHandle` instances for every call to `page.waitFor*` functions. These handles should be disposed manually to avoid memory consumption.
Fixes#1703, fixes#1654, fixes#1724.
This patch adds two new methods to the `page.coverage` namespace:
- `page.coverage.startCSSCoverage()` - to initiate css coverage
- `page.coverage.stopCSSCoverage()` - to stop css coverage
The coverage format is consistent with the JavaScript coverage.
This patch introduces a new `page.coverage` namespace with two methods:
- `page.coverage.startJSCoverage` to initiate JavaScript coverage
recording
- `page.coverage.stopJSCoverage` to stop JavaScript coverage and get
results
This patch:
- adds `puppeteer.defaultArgs()` method to get default arguments that are used to launch chrome
- adds `ignoreDefaultArgs` option to `puppeteer.launch` to avoid using default puppeteer arguments
Fixes#872
The patch converts all the getters in the codebase into the methods.
For example, the `request.url` getter becomes the `request.url()`
method.
This is done in order to unify the API and make it more predictable.
The general rule for all further changes would be:
- there are no getters/fields exposed in the api
- the only exceptions are "namespaces", e.g. `page.keyboard`
Fixes#280.
BREAKING CHANGE:
This patch ditches getters and replaces them with methods throughout
the API. The following methods were added instead of the fields:
- dialog.type()
- consoleMessage.args()
- consoleMessage.text()
- consoleMessage.type()
- request.headers()
- request.method()
- request.postData()
- request.resourceType()
- request.url()
- response.headers()
- response.ok()
- response.status()
- response.url()
This refactors the page.content and page.setContent methods to be defined on the Frame class. This allows access from the Page still but also on all frames.
Fixes#754
In Blink, frames don't necesserily have execution context all the time.
DevTools Protocol precisely reports this situation, which results in
Puppeteer's frame.executionContext() being null occasionally.
However, from puppeteer point of view every frame will have at least a
default executions context, sooner or later:
- frame's execution context might be created naturally to run frame's
javascript
- if frame has no javascript, devtools protocol will issue execution
context creation
This patch builds up on this assumption and makes frame.executionContext()
to be a promise.
As a result, all the evaluations await for the execution context to be created first.
Fixes#827, #1325
BREAKING CHANGE: this patch changes frame.executionContext() method to return a promise.
To migrate onto a new behavior, await the context first before using it.
The SIGHUP signal is sent whenever the controlling terminal is closed.
On Windows, SIGHUP is emulated by libuv, and will be the only signal we
receive before the application will be terminated.
This patch starts handling SIGHUP in the same way we handle SIGTERM.
Fixes#1367.
SIGTERM signal is widely used to notify application that it will be shut down.
This patch starts listening to SIGTERM event to gracefully retire
chromium instance.
References #1047.
This patch adds `Frame.select` method that does the same functionality as
former `Page.select`, but on a per-frame level.
The `Page.select` method becomes a shortcut for the ÷main frame's select.
Fixes#1139
This patch migrates puppeteer to support PlzNavigate chromium
project.
As a consequence of this patch, we no longer wait for both
requestWillBeSent and requestIntercepted events to happen. This should
resolve a ton of request interception bugs that "hanged" the loading.
Fixes#877.
This adds more examples for using `keyboard.type` and `keyboard.press`. It adds warnings about Shift affecting or not affecting the text generated by certain methods.
fixes#723
The jontewks buildpack worked great for running Chrome headless, but it did not include support for Chinese characters and the PDF rendering I was attempting was not usable. This fix brings in a few custom fonts to allow PDF generation to utilize the fonts.
The idea came from https://github.com/dscout/wkhtmltopdf-buildpack/pull/14/files
This roll includes:
- crrev.com/510651 that changes request interception methods in protocol
- s/Page.setRequestInterceptionEnabled/Page.setRequestInterception
BREAKING CHANGE
Page.setRequestInterceptionEnabled is renamed into
Page.setRequestInterception.
This patch adds "options" parameter to the `page.setContent` method. The
parameter is the same as a navigation parameter and allows to specify
maximum timeout to wait for resources to be loaded, as well as to
describe events that should be emitted before the setContent operation
would be considered successful.
Fixes#728.
This patch adds support to multiple events that could be passed inside
navigation methods:
- Page.goto
- Page.waitForNavigation
- Page.goForward
- Page.goBack
- Page.reload
Fixes#805
This patch adds a new `domcontentloaded` option to a bunch of navigation
methods:
- Page.goto
- Page.waitForNavigation
- Page.goBack
- Page.goForward
- Page.reload
Fixes#946.
This patch:
- migrates navigation watcher to use protocol-issued lifecycle events.
- removes `networkIdleTimeout` and `networkIdleInflight` options for
`page.goto` method
- adds a new `networkidle0` value to the waitUntil option of navigation
methods
References #728.
BREAKING CHANGE:
As an implication of this new approach, the `networkIdleTimeout` and
`networkIdleInflight` options are no longer supported. Interested
clients should implement the behavior themselves using the `request` and
`response` events.
BREAKING CHANGE:
This patch lets key names be code in addition to key. When specifying a code, the proper text is generated assuming a standard US keyboard layout. e.g Digit5 -> "5" or "%" depending on Shift.
* location is now specified. #777
* Using unknown key names now throws an error. #723
* Typing newlines now correctly presses enter. #681
feat(interception): Implement request.respond method
This patch implements a new Request.respond method. This
allows users to fulfill the intercepted request with a hand-crafted
response if they wish so.
References #1020.
Currently, JSHandle.jsonValue() is implemented as in-page JSON.stringify
call and consequent JSON.parse in node. This approach proved to be
unfortunate for automation purposes: if page author overrode the
Object.prototype.toJSON method, then it's harder for puppeteer to
interact with the page.
This patch switches JSHandle.jsonValue to use protocol serialization
that ignores toJSON property. THis also changes the `page.evaluate`
behavior since it is based on JSHandle.jsonValue().
Fixes#1003.
BREAKING CHANGE:
`page.evaluate` no longer calls toJSON when generating return value.
For the old behavior, do JSON.parse/JSON.stringify manually:
```js
const json = JSON.parse(await page.evaluate(() => JSON.stringify(obj)));
```
This patch:
- introduces Target class that represents any inspectable target, such as service worker or page
- emits events when targets come and go
- introduces target.page() to instantiate a page from a target
Fixes#386, fixes#443.
Similarly to the `request.response()` method, this patch adds
`request.failure()` method that returns error details for the failed
requests.
Fixes#901.
This patch:
- changes `browser.close` to terminate browser.
- introduces new `browser.disconnect` to disconnect from a browser without closing it
This patch: fixes#918, fixes#989
BREAKING CHANGE:
`browser.close()` will always close a browser, even if it was initialized with
`puppeteer.connect`. To disconnect from a remote browser, use `browser.disconnect()` instead.