Puppeteer API
Table of Contents
- class: Browser
- class: Page
- page.addScriptTag(url)
- page.click(selector)
- page.close()
- page.evaluate(pageFunction, ...args)
- page.evaluateOnInitialized(pageFunction, ...args)
- page.focus(selector)
- page.frames()
- page.httpHeaders()
- page.injectFile(filePath)
- page.mainFrame()
- page.navigate(url, options)
- page.plainText()
- page.printToPDF(filePath[, options])
- page.screenshot([options])
- page.setContent(html)
- page.setHTTPHeaders(headers)
- page.setInPageCallback(name, callback)
- page.setRequestInterceptor(interceptor)
- page.setUserAgent(userAgent)
- page.setViewportSize(size)
- page.title()
- page.type(text)
- page.uploadFile(selector, ...filePaths)
- page.url()
- page.userAgent()
- page.viewportSize()
- page.waitFor(selector)
- class: Dialog
- class: Frame
- class: Request
- class: Response
- class: InterceptedRequest
- class: Headers
- class: Body
class: Browser
Browser manages a browser instance: it creates the instance with predefined settings and opens and closes pages. Instantiating the Browser class does not necessarily launch the browser; the instance is launched when the need arises.
A typical scenario of using Browser is opening a new page and navigating it to a desired URL:
const {Browser} = require('puppeteer');
const browser = new Browser();
browser.newPage().then(async page => {
  await page.navigate('https://example.com');
  browser.close();
});
new Browser([options])
- options <Object> Set of configurable options to set on the browser. Can have the following fields:
browser.close()
Closes the browser with all its pages (if any were opened). The browser object itself is considered to be disposed and cannot be used anymore.
browser.closePage(page)
This is an alias for the page.close() method.
browser.newPage()
browser.stderr
A Readable Stream that represents the browser process's stderr.
For example, stderr could be piped into process.stderr:
const {Browser} = require('puppeteer');
const browser = new Browser();
browser.stderr.pipe(process.stderr);
browser.version().then(version => {
  console.log(version);
  browser.close();
});
browser.stdout
A Readable Stream that represents the browser process's stdout.
For example, stdout could be piped into process.stdout:
const {Browser} = require('puppeteer');
const browser = new Browser();
browser.stdout.pipe(process.stdout);
browser.version().then(version => {
  console.log(version);
  browser.close();
});
browser.version()
- returns: <Promise<string>> String describing browser version. For headless Chromium, this is similar to HeadlessChrome/61.0.3153.0. For non-headless, this is similar to Chrome/61.0.3153.0.
Note: the format of browser.version() is not fixed and might change with future releases of the library.
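Since the version format is not a stable contract, code that inspects it may want to parse defensively. The following is a minimal sketch that assumes the Product/x.y.z.w shape shown above; parseBrowserVersion is a hypothetical helper, not part of Puppeteer, and it returns null for anything it does not recognize.

```javascript
// Hypothetical helper (not a Puppeteer API): split a version string such as
// "HeadlessChrome/61.0.3153.0" into its parts. Returns null for any
// unexpected shape, because the format is not guaranteed.
function parseBrowserVersion(versionString) {
  const match = /^([A-Za-z]+)\/(\d+(?:\.\d+)*)$/.exec(String(versionString).trim());
  if (!match)
    return null;
  return {
    product: match[1],
    version: match[2],
    headless: match[1].startsWith('Headless'),
  };
}
```

It could be used as browser.version().then(v => console.log(parseBrowserVersion(v))).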
class: Page
Page provides methods to interact with a browser page. A Page can be thought of as a browser tab, so one Browser instance might have multiple Page instances.
An example of creating a page, navigating it to a URL and saving a screenshot as screenshot.png:
const {Browser} = require('puppeteer');
const browser = new Browser();
browser.newPage().then(async page => {
  await page.navigate('https://example.com');
  await page.screenshot({path: 'screenshot.png'});
  browser.close();
});
page.addScriptTag(url)
- url <string> URL of a script to be added
- returns: <Promise> Promise which resolves as the script gets added and loads.
Adds a <script></script> tag to the page with the desired URL. Alternatively, JavaScript could be injected into the page via the page.injectFile method.
page.click(selector)
- selector <string> A query selector to search for element to click. If there are multiple elements satisfying the selector, the first will be clicked.
- returns: <Promise> Promise which resolves when the element matching selector is successfully clicked. Promise gets rejected if there's no element matching selector.
page.close()
- returns: <Promise> Returns promise which resolves when page gets closed.
page.evaluate(pageFunction, ...args)
- pageFunction <function> Function to be evaluated in browser context
- ...args <...string> Arguments to pass to pageFunction
- returns: <Promise<Object>> Promise which resolves to function return value
This is a shortcut for page.mainFrame().evaluate() method.
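As an illustration of passing arguments, here is a minimal sketch. Because pageFunction is evaluated in browser context, it should not rely on variables from the surrounding Node.js scope; values are handed over through ...args instead. The names joinInPage and greet are hypothetical, and page is assumed to be a Page obtained from browser.newPage().

```javascript
// Runs inside the page. Everything it needs arrives through its arguments,
// which are documented above as strings.
function joinInPage(first, second) {
  return first + ' ' + second;
}

// Hypothetical helper: resolves to the value returned by joinInPage in the page.
function greet(page, greeting, name) {
  return page.evaluate(joinInPage, greeting, name);
}
```

Calling greet(page, 'hello', 'world') would then resolve to the string built inside the page.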
page.evaluateOnInitialized(pageFunction, ...args)
- pageFunction <function> Function to be evaluated in browser context
- ...args <...string> Arguments to pass to pageFunction
- returns: <Promise<Object>> Promise which resolves to function
page.evaluateOnInitialized adds a function which runs on every page navigation, before any of the page's own JavaScript. This is useful to amend the JavaScript environment, e.g. to seed Math.random.
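The Math.random case could be sketched as follows. This is an illustration under assumptions, not the library's own recipe: seedMathRandom and installSeed are hypothetical names, the generator is a plain Park-Miller LCG chosen for brevity, and the seed is passed as a string because arguments are documented as strings.

```javascript
// Runs in the page before the page's own scripts. Replaces Math.random with a
// deterministic Park-Miller generator seeded from seedText.
function seedMathRandom(seedText) {
  const modulus = 2147483647;
  let state = (parseInt(seedText, 10) || 1) % modulus;
  if (state <= 0)
    state += modulus - 1;  // keep the state inside [1, modulus - 1]
  Math.random = function() {
    state = (state * 48271) % modulus;
    return state / modulus;
  };
}

// Hypothetical helper: page is assumed to be a Page from browser.newPage().
function installSeed(page, seed) {
  return page.evaluateOnInitialized(seedMathRandom, String(seed));
}
```

With the same seed, every navigation would start from the same sequence of Math.random values, which can make page behavior easier to reproduce.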
page.focus(selector)
- selector <string> A query selector of element to focus. If there are multiple elements satisfying the selector, the first will be focused.
- returns: <Promise> Promise which resolves when the element matching selector is successfully focused. Promise gets rejected if there's no element matching selector.
page.frames()
page.httpHeaders()
- returns: <Object> Key-value set of additional http headers which will be sent with every request.
page.injectFile(filePath)
- filePath <string> Path to the JavaScript file to be injected into page.
- returns: <Promise> Promise which resolves when file gets successfully evaluated in page.
page.mainFrame()
- returns: <Frame> returns page's main frame.
Page is guaranteed to have a main frame which persists during navigations.
page.navigate(url, options)
- url <string> URL to navigate page to
- options <Object> Navigation parameters which might have the following properties:
  - maxTime <number> Maximum navigation time in milliseconds, defaults to 30 seconds.
  - waitFor <string> When to consider navigation succeeded, defaults to load. Could be either:
    - load - consider navigation to be finished when the load event is fired.
    - networkidle - consider navigation to be finished when the network activity stays "idle" for at least networkIdleTimeout ms.
  - networkIdleInflight <number> Maximum amount of inflight requests which are considered "idle". Takes effect only with waitFor: 'networkidle' parameter.
  - networkIdleTimeout <number> A timeout to wait before completing navigation. Takes effect only with waitFor: 'networkidle' parameter.
- returns: <Promise<Response>> Promise which resolves to the main resource response. In case of multiple redirects, the navigation will resolve with the response of the last redirect.
page.navigate will throw an error if:
- there's an SSL error (e.g. in case of self-signed certificates).
- target URL is invalid.
- the maxTime is exceeded during navigation.
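The options and failure cases above can be combined in a small wrapper. This is a minimal sketch under assumptions: tryNavigate is a hypothetical helper, page is assumed to be a Page from browser.newPage(), and the timeout values are arbitrary examples.

```javascript
// Hypothetical helper: resolves to the Response on success, or to null if
// navigation fails (SSL error, invalid URL, or maxTime exceeded).
async function tryNavigate(page, url) {
  try {
    return await page.navigate(url, {
      maxTime: 10000,            // give up after 10 seconds
      waitFor: 'networkidle',    // wait for the network to go quiet, not just load
      networkIdleTimeout: 1000,  // "idle" must last at least 1 second
    });
  } catch (error) {
    console.error('Navigation to ' + url + ' failed: ' + error.message);
    return null;
  }
}
```

Returning null instead of rethrowing is a design choice for this sketch; callers that need the failure reason would rethrow or return the error.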
page.plainText()
page.printToPDF(filePath[, options])
- filePath <string> The file path to save the PDF to.
- options <Object> Options object which might have the following properties:
- returns: <Promise> Promise which resolves when the PDF is saved.
page.screenshot([options])
- options <Object> Options object which might have the following properties:
  - path <string> The file path to save the image to. The screenshot type will be inferred from file extension.
  - type <string> Specify screenshot type, could be either jpeg or png.
  - quality <number> The quality of the image, between 0-100. Not applicable to .png images.
  - fullPage <boolean> When true, takes a screenshot of the full scrollable page.
  - clip <Object> An object which specifies clipping region of the page. Should have the following fields:
- returns: <Promise<Buffer>> Promise which resolves to buffer with captured screenshot
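Since quality does not apply to PNG output, an options object can be built conditionally. The sketch below is illustrative only: screenshotOptions is a hypothetical helper, and treating both .jpg and .jpeg as JPEG extensions is an assumption of this example.

```javascript
// Hypothetical helper: build a page.screenshot() options object for `path`.
// quality is attached only for JPEG paths and clamped to the 0-100 range.
function screenshotOptions(path, quality) {
  const options = {path, fullPage: true};
  const isJpeg = /\.jpe?g$/i.test(path);
  if (isJpeg && typeof quality === 'number')
    options.quality = Math.max(0, Math.min(100, Math.round(quality)));
  return options;
}
```

A call such as page.screenshot(screenshotOptions('page.jpg', 80)) would then request a full-page JPEG at quality 80.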
page.setContent(html)
- html <string> HTML markup to assign to the page.
- returns: <Promise> Promise which resolves when the content is successfully assigned.
page.setHTTPHeaders(headers)
- headers <Object> Key-value set of additional HTTP headers to be sent with every request.
- returns: <Promise> Promise which resolves when additional headers are installed
page.setInPageCallback(name, callback)
- name <string> Name of the callback to be assigned on window object
- callback <function> Callback function which will be called in Puppeteer's context.
- returns: <Promise> Promise which resolves when callback is successfully initialized
The in-page callback allows the page to asynchronously reach back to Puppeteer. An example of a page showing the number of CPUs:
const os = require('os');
const {Browser} = require('puppeteer');
const browser = new Browser();
browser.newPage().then(async page => {
  await page.setInPageCallback('getCPUCount', () => os.cpus().length);
  await page.evaluate(async () => {
    alert(await window.getCPUCount());
  });
  browser.close();
});
page.setRequestInterceptor(interceptor)
- interceptor <function> Callback function which accepts a single argument of type <InterceptedRequest>.
- returns: <Promise> Promise which resolves when request interceptor is successfully installed on the page.
After the request interceptor is installed on the page, every request will be reported to the interceptor. The InterceptedRequest could be modified and then either continued via the continue() method, or aborted via the abort() method.
An example of a naive request interceptor which aborts all image requests:
const {Browser} = require('puppeteer');
const browser = new Browser();
browser.newPage().then(async page => {
  await page.setRequestInterceptor(interceptedRequest => {
    if (interceptedRequest.url.endsWith('.png') || interceptedRequest.url.endsWith('.jpg'))
      interceptedRequest.abort();
    else
      interceptedRequest.continue();
  });
  await page.navigate('https://example.com');
  browser.close();
});
page.setUserAgent(userAgent)
- userAgent <string> Specific user agent to use in this page
- returns: <Promise> Promise which resolves when the user agent is set.
page.setViewportSize(size)
- size <Object> An object with two fields:
- returns: <Promise> Promise which resolves when the dimensions are updated.
The page's viewport size defines the page's dimensions, observable from the page via window.innerWidth / window.innerHeight. The viewport size also defines the size of a page screenshot (unless the fullPage option is given).
In case of multiple pages in one browser, each page can have its own viewport size.
page.title()
page.type(text)
- text <string> A text to type into a focused element.
- returns: <Promise> Promise which resolves when the text has been successfully typed.
page.uploadFile(selector, ...filePaths)
- selector <string> A query selector to a file input
- ...filePaths <string> Sets the value of the file input to these paths
- returns: <Promise> Promise which resolves when the value is set.
page.url()
page.userAgent()
- returns: <string> Returns user agent.
page.viewportSize()
- returns: <Object> An object with two fields:
page.waitFor(selector)
- selector <string> A query selector to wait for on the page.
- returns: <Promise> Promise which resolves when the element matching selector appears in the page.
Shortcut for page.mainFrame().waitFor(selector).
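A common use is to wait for an element before interacting with it, so that page.click does not reject because the element is not there yet. A minimal sketch, assuming page is a Page from browser.newPage(); clickWhenPresent is a hypothetical helper name.

```javascript
// Hypothetical helper: wait until `selector` matches an element, then click it.
async function clickWhenPresent(page, selector) {
  await page.waitFor(selector);  // resolves once the element appears in the page
  await page.click(selector);    // would reject if nothing matched the selector
}
```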
class: Dialog
dialog.accept([promptText])
- promptText <string> A text to enter in prompt. Does not cause any effects if the dialog's type is not prompt.
- returns: <Promise> Promise which resolves when the dialog has been accepted.
dialog.dismiss()
- returns: <Promise> Promise which resolves when the dialog has been dismissed.
dialog.message()
- returns: <string> A message displayed in the dialog.
dialog.type
- <string>
Dialog's type, could be one of alert, beforeunload, confirm and prompt.
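The Dialog members above could be combined into a single handler, sketched below. handleDialog is a hypothetical function, and how the Dialog instance reaches it is outside the scope of this section and of this sketch.

```javascript
// Hypothetical handler: answer prompts with `promptText`, accept alerts and
// confirms, and dismiss anything else (such as beforeunload).
function handleDialog(dialog, promptText) {
  console.log('Dialog (' + dialog.type + '): ' + dialog.message());
  switch (dialog.type) {
    case 'prompt':
      return dialog.accept(promptText);
    case 'alert':
    case 'confirm':
      return dialog.accept();
    default:
      return dialog.dismiss();
  }
}
```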
class: Frame
frame.childFrames()
frame.evaluate(pageFunction, ...args)
- pageFunction <function> Function to be evaluated in browser context
- ...args <Array<string>> Arguments to pass to pageFunction
- returns: <Promise<Object>> Promise which resolves to function return value
If the function passed to page.evaluate returns a Promise, then page.evaluate waits for the promise to resolve and returns its value.
const {Browser} = require('puppeteer');
const browser = new Browser();
browser.newPage().then(async page => {
  const result = await page.evaluate(() => {
    return Promise.resolve().then(() => 8 * 7);
  });
  console.log(result); // prints "56"
  browser.close();
});
frame.isDetached()
- returns: <boolean>
Returns true if the frame has been detached, or false otherwise.
frame.isMainFrame()
- returns: <boolean>
Returns true if the frame is the page's main frame, or false otherwise.
frame.name()
- returns: <string>
Returns frame's name as specified in the tag.
frame.parentFrame()
- returns: <Frame> Returns parent frame, if any. Detached frames and main frames return null.
frame.url()
- returns: <string>
Returns frame's url.
frame.waitFor(selector)
- selector <string> CSS selector of awaited element.
- returns: <Promise> Promise which resolves when element specified by selector string is added to DOM.
Waits for the selector to appear in the page. If the selector already exists at the moment the method is called, the method returns immediately.
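To illustrate how the Frame methods fit together, the sketch below walks a frame tree starting from any frame. It assumes frame.childFrames() returns an array of Frame objects, which this section does not spell out; describeFrameTree is a hypothetical helper.

```javascript
// Hypothetical helper: collect one line per frame, indented by nesting depth.
// Assumption: frame.childFrames() returns an array of Frame objects.
function describeFrameTree(frame, indent = '') {
  const label = frame.name() ? ' (' + frame.name() + ')' : '';
  const lines = [indent + frame.url() + label];
  for (const child of frame.childFrames())
    lines.push(...describeFrameTree(child, indent + '  '));
  return lines;
}
```

For a whole page, it could be called as console.log(describeFrameTree(page.mainFrame()).join('\n')).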
class: Request
Request class represents requests which are sent by page. Request implements Body mixin, which in case of HTTP POST requests allows clients to call request.json() or request.text() to get different representations of request's body.
request.headers
- <Headers>
Contains the associated Headers object of the request.
request.method
- <string>
Contains the request's method (GET, POST, etc.)
request.response()
request.url
- <string>
Contains the URL of the request.
class: Response
Response class represents responses which are received by page. Response implements Body mixin, which allows clients to call response.json() or response.text() to get different representations of response body.
response.headers
- <Headers>
Contains the Headers object associated with the response.
response.ok
- <boolean>
Contains a boolean stating whether the response was successful (status in the range 200-299) or not.
response.request()
response.status
- <number>
Contains the status code of the response (e.g., 200 for a success).
response.statusText
- <string>
Contains the status message corresponding to the status code (e.g., OK for 200).
response.url
- <string>
Contains the URL of the response.
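Taken together, the Response properties are enough to log an outcome and to decide whether the body is worth parsing. A minimal sketch; describeResponse and jsonIfOk are hypothetical helpers, and response is assumed to be a Response such as the one page.navigate resolves to.

```javascript
// Hypothetical helper: summarize a Response as one log line.
function describeResponse(response) {
  const outcome = response.ok ? 'OK' : 'FAILED';
  return outcome + ' ' + response.status + ' ' + response.statusText + ' ' + response.url;
}

// Hypothetical helper: parse the body as JSON only for successful responses,
// and resolve to null otherwise.
function jsonIfOk(response) {
  return response.ok ? response.json() : Promise.resolve(null);
}
```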
class: InterceptedRequest
InterceptedRequest represents an intercepted request, which can be mutated and either continued or aborted. InterceptedRequest which is not continued or aborted will be in a 'hanging' state.
interceptedRequest.abort()
Aborts request.
interceptedRequest.continue()
Continues request.
interceptedRequest.headers
- <Headers>
Contains the Headers object associated with the request.
Headers could be mutated with the headers.append, headers.set and other methods. Must not be changed in response to an authChallenge.
interceptedRequest.isHandled()
- returns: <boolean> returns true if either abort or continue was called on the object. Otherwise, returns false.
interceptedRequest.method
- <string>
Contains the request's method (GET, POST, etc.)
If set this allows the request method to be overridden. Must not be changed in response to an authChallenge.
interceptedRequest.postData
- <string>
Contains POST data for POST requests.
request.postData is mutable and could be written to. Must not be changed in response to an authChallenge.
interceptedRequest.url
- <string>
If changed, the request url will be modified in a way that's not observable by page. Must not be changed in response to an authChallenge.
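The mutable members of InterceptedRequest could be combined as in the sketch below, which makes sure every request is either aborted or continued so that none is left hanging. makeInterceptor, the blocked host and the x-intercepted header name are all hypothetical, and the sketch does not account for authChallenge handling.

```javascript
// Hypothetical interceptor factory: aborts requests to `blockedHost`, tags
// every other request with a custom header, and always settles the request.
function makeInterceptor(blockedHost) {
  return interceptedRequest => {
    if (interceptedRequest.url.includes('//' + blockedHost + '/')) {
      interceptedRequest.abort();
      return;
    }
    interceptedRequest.headers.set('x-intercepted', 'true');
    interceptedRequest.continue();
  };
}
```

It would be installed with page.setRequestInterceptor(makeInterceptor('ads.example.com')).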
class: Headers
headers.append(name, value)
If there's already a header with name name, the header gets overwritten.
headers.delete(name)
- name <string> Case-insensitive name of the header to be deleted. If there's no header with such name, the method does nothing.
headers.entries()
- returns: <iterator> An iterator allowing to go through all key/value pairs contained in this object. Both the key and value of each pair are string objects.
headers.get(name)
- name <string> Case-insensitive name of the header.
- returns: <string> Header value, or null if there's no such header.
headers.has(name)
- name <string> Case-insensitive name of the header.
- returns: <boolean> Returns true if the header with such name exists, or false otherwise.
headers.keys()
- returns: <iterator> an iterator allowing to go through all keys contained in this object. The keys are string objects.
headers.set(name, value)
If there's already a header with name name, the header gets overwritten.
headers.values()
- returns: <iterator<string>> Returns an iterator allowing to go through all values contained in this object. The values are string objects.
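The iterator methods make it straightforward to copy headers into a plain object, for example for logging. A minimal sketch, assuming each item yielded by headers.entries() is a two-element [name, value] array; headersToObject is a hypothetical helper.

```javascript
// Hypothetical helper: copy a Headers object into a plain object.
// Assumption: headers.entries() yields [name, value] string pairs.
function headersToObject(headers) {
  const result = {};
  for (const [name, value] of headers.entries())
    result[name] = value;
  return result;
}
```

For instance, console.log(headersToObject(response.headers)) would print all response headers at once.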
class: Body
body.arrayBuffer()
- returns: <Promise<ArrayBuffer>>
body.bodyUsed
- returns: <boolean>
body.buffer()
- returns: <Promise<Buffer>>
body.json()
- returns: <Promise<Object>>
body.text()
- returns: <Promise<string>>