How to use the Apify tool
What is Apify?
Apify is a fully managed scraper provider. It has a marketplace of scrapers, including website parsers, YouTube scrapers, and more. Apify calls these scrapers "actors". All actors run on the Apify platform, and each actor has unique inputs and outputs defined by its creator.
The Apify tool allows an agent to trigger an actor on Apify. Once the actor completes, the crawled data is passed to TorqAgent, which answers the initial prompt using the crawled data as context.
Example use case
Crawl YouTube for the latest videos mentioning a keyword, then summarize the latest comments from the videos found.
https://www.awesomescreenshot.com/video/23970966?key=b58e6843fb9c59e62d1d40edbc3a7bba
How to configure

Tool Name
The tool name is the name the LLM uses to identify which tool to use for a specific task. Tool names must be alphanumeric and cannot contain spaces or special characters.
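The naming rule above can be checked before saving a tool. The following is a minimal sketch, not part of the Apify tool itself: isValidToolName is a hypothetical helper, and it assumes "alphanumeric" means ASCII letters and digits only, so underscores and hyphens are rejected.

```javascript
// Hypothetical helper for illustration; the tool does not expose this function.
// Assumption: "alphanumeric" means ASCII letters and digits only.
function isValidToolName(name) {
  return typeof name === 'string' && /^[A-Za-z0-9]+$/.test(name);
}

console.log(isValidToolName('YoutubeCommentScraper'));   // true
console.log(isValidToolName('Youtube Comment Scraper')); // false (contains spaces)
console.log(isValidToolName('youtube-comments'));        // false (contains a hyphen)
```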

Actor
This is the actor name on Apify. You can find this value on the actor's page, for example https://apify.com/streamers/youtube-comments-scraper. Below is a screenshot of where to find the actor value on this page.

Tool Description
This is the description the LLM uses to understand when to use this tool. An example for a YouTube comment scraper actor would be "Useful when needing to get video comments from a given YouTube video URL".
Input Schema Function
This is a function that returns a Zod schema describing the input the LLM can give the tool. A Zod schema lets you define an object whose properties have types, descriptions, defaults, minimum and maximum values, and so on. The schema is then translated to plain text for the model so the LLM knows how to send data to the tool. You can find the full Zod documentation here.
- Your function must be named handle and must not take any arguments, as this is how the tool will execute the function.
- All Apify input must be defined within a property called input (see example below).
Below is an example of an Input Schema Function that can be used for the input of the YouTube comment scraper here
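function handle() {
  return z.object({
    // All actor input must be nested under the `input` property.
    input: z.object({
      // List of videos to scrape, one object per video URL.
      startUrls: z.array(
        z.object({
          url: z.string().describe('Youtube video urls that we would like to collect comments from. Here is an example video url "https://www.youtube.com/watch?v=oFrgP6C8pRY".'),
        })
      ),
      // Upper bound on collected comments; falls back to 10 when omitted.
      maxComments: z.number().default(10).describe('Max number of results to collect, this defaults to 10'),
    }),
  })
}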
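To make the schema above more concrete, here is a sketch of the kind of object the LLM might send to the tool when the schema is satisfied. The values are made up for illustration, and the assumption that only the content of input is forwarded to the actor is inferred from the rule above rather than confirmed behavior.

```javascript
// Hypothetical tool call arguments that match the schema above.
const exampleToolInput = {
  input: {
    // One object per video, each with a `url` string.
    startUrls: [{ url: 'https://www.youtube.com/watch?v=oFrgP6C8pRY' }],
    // Optional in the schema; the default of 10 applies when omitted.
    maxComments: 25,
  },
};

// Assumption: the object under `input` is what the actor receives.
const actorInput = exampleToolInput.input;
console.log(JSON.stringify(actorInput, null, 2));
```

Comparing this object with the actor's own input documentation on Apify is a quick way to confirm that the property names in the schema match what the actor expects.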
Output Parser Function
This function is used to parse the results from the Apify actor. Every actor has different outputs, so a custom parser needs to be created for each one.
- The parser function must be called handle.
- The function accepts one argument, which is an item of the collection returned from Apify (ex: handle(item) {....}).
- The function must return a LangChain Document object.
- The Document object must contain pageContent and metadata.source. This will be used as context when given to the LLM.
Below is an example of an output parser function for the YouTube comment scraper here
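function handle(item) {
  // `item` is a single record from the actor's results.
  return new Document({
    // Text passed to the LLM as context. The lines inside the template
    // literal are left unindented so the output has no leading spaces.
    pageContent: `Comment: ${item.comment}
Author: ${item.author}
Reply Count: ${item.replyCount}
Vote Count: ${item.voteCount}
`,
    // Where the record came from.
    metadata: { source: item.pageUrl },
  })
}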
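To try a parser outside the tool, one option is to run it against a sample item. The sketch below assumes the tool runtime provides the Document class, so a minimal stand-in is defined here for local testing only. The sample item values are invented; the field names are the ones used in the parser above.

```javascript
// Stand-in for the Document class that the tool runtime is assumed to provide.
class Document {
  constructor({ pageContent, metadata }) {
    this.pageContent = pageContent;
    this.metadata = metadata;
  }
}

// Same parser as in the example above.
function handle(item) {
  return new Document({
    pageContent: `Comment: ${item.comment}
Author: ${item.author}
Reply Count: ${item.replyCount}
Vote Count: ${item.voteCount}
`,
    metadata: { source: item.pageUrl },
  });
}

// Hypothetical actor result item; the values are made up for illustration.
const sampleItem = {
  comment: 'Great walkthrough, thanks!',
  author: 'exampleUser',
  replyCount: 2,
  voteCount: 15,
  pageUrl: 'https://www.youtube.com/watch?v=oFrgP6C8pRY',
};

const doc = handle(sampleItem);
console.log(doc.pageContent);
console.log(doc.metadata.source);
```

If a field is missing from an item, the template literal prints "undefined" for it, so it may be worth checking a few real items from the actor's results before relying on a field.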
Updated on: 04/04/2024