libwget-robots(3) Introduction to Library Functions libwget-robots(3)
NAME
libwget-robots - Robots Exclusion file parser
SYNOPSIS
Data Structures
struct wget_robots_st
Macros
#define parse_record_field(d, f)
Functions
int wget_robots_parse (wget_robots **_robots, const char *data, const char *client)
void wget_robots_free (wget_robots **robots)
int wget_robots_get_path_count (wget_robots *robots)
wget_string * wget_robots_get_path (wget_robots *robots, int index)
int wget_robots_get_sitemap_count (wget_robots *robots)
const char * wget_robots_get_sitemap (wget_robots *robots, int index)
Detailed Description
The purpose of this set of functions is to parse a Robots Exclusion
Standard file into a data structure for easy access.
Macro Definition Documentation
#define parse_record_field( d, f)
Value:
parse_record_field(d, f, sizeof(f) - 1)
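The macro wraps a function of the same name and supplies the length of a string literal at compile time. The following is a minimal, self-contained sketch of that idiom, not the library's implementation: the body of the three-argument helper is hypothetical, and only the macro line is taken from the documentation above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (assumption, not libwget code): if *data starts with
 * "<field>:" (case-insensitive), advance *data past the colon and any
 * following blanks and return true. Otherwise leave *data untouched. */
static bool parse_record_field(const char **data, const char *field, size_t field_length)
{
    assert(data && *data && field);

    for (size_t i = 0; i < field_length; i++) {
        /* A 0-byte in *data never matches a field character, so the loop
         * stops before reading past the end of the input. */
        if (tolower((unsigned char) (*data)[i]) != tolower((unsigned char) field[i]))
            return false;
    }
    if ((*data)[field_length] != ':')
        return false;

    *data += field_length + 1;
    while (**data == ' ' || **data == '\t')
        (*data)++;
    return true;
}

/* Defined after the function. A macro name is not expanded again inside its
 * own expansion, so the call reaches the function above. sizeof(f) - 1 is the
 * length of a string literal without its trailing 0-byte. */
#define parse_record_field(d, f) parse_record_field(d, f, sizeof(f) - 1)
```

With this in place, `parse_record_field(&line, "Disallow")` expands to a call with length 8. The second argument has to be a string literal or a char array, because `sizeof` applied to a `const char *` would yield the pointer size instead.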
Function Documentation
int wget_robots_parse (wget_robots ** _robots, const char * data, const
char * client)
Parameters
_robots Address of a wget_robots pointer that receives the parsed structure
data Memory with robots.txt content (with trailing 0-byte)
client Name of the client / user-agent
Returns
An int status code. The allocated wget_robots structure is stored in
*_robots; on error no structure is returned.
The function parses the robots.txt data in accordance with
https://www.robotstxt.org/orig.html#format and stores a ROBOTS
structure containing a list of the disallowed paths and a
list of the sitemap files.
The ROBOTS structure has to be freed by calling wget_robots_free().
void wget_robots_free (wget_robots ** robots)
Parameters
robots Pointer to pointer to wget_robots structure
wget_robots_free() frees the previously allocated wget_robots
structure.
int wget_robots_get_path_count (wget_robots * robots)
Parameters
robots Pointer to instance of wget_robots
Returns
Returns the number of paths listed in robots
wget_string * wget_robots_get_path (wget_robots * robots, int index)
Parameters
robots Pointer to instance of wget_robots
index Index of the wanted path
Returns
Returns the path at index or NULL
int wget_robots_get_sitemap_count (wget_robots * robots)
Parameters
robots Pointer to instance of wget_robots
Returns
Returns the number of sitemaps listed in robots
const char * wget_robots_get_sitemap (wget_robots * robots, int index)
Parameters
robots Pointer to instance of wget_robots
index Index of the wanted sitemap URL
Returns
Returns the sitemap URL at index or NULL
Author
Generated automatically by Doxygen for wget2 from the source code.
wget2 Version 2.2.1 libwget-robots(3)