Package 'cranology'

Title: The CRAN Chronology
Description: Scraping routines and datasets to monitor the evolution of the number of packages on CRAN.
Authors: Antoine Languillaume [aut, cre], Sebastien Rochette [aut], Vincent Guyader [aut], ThinkR [cph]
Maintainer: Antoine Languillaume <[email protected]>
License: MIT + file LICENSE
Version: 1.0.0
Built: 2024-11-10 02:59:44 UTC
Source: https://github.com/ThinkR-open/cranology

Help Index


cran_monthly_package_number

Description

The evolution of the number of packages on CRAN since its beginning. Last update: 2024-09-10.

Usage

cran_monthly_package_number

Format

A data frame with 324 rows and 2 variables:

date

[Date] Month of release.

number_packages

[numeric] Number of packages available on CRAN at that given date.

Source

https://cran.rstudio.com/src/contrib/ and https://packagemanager.posit.co/cran/
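
As a minimal sketch of how a data frame in this format can be used, the base R snippet below derives the month-over-month change in the package count. The three rows are invented placeholder values, not taken from the dataset, and 'toy' and 'monthly_change' are hypothetical names. With {cranology} attached, 'cran_monthly_package_number' could presumably be substituted for 'toy'.

```r
# Toy stand-in with the two documented columns; the values are invented
# for illustration and do not come from the real dataset.
toy <- data.frame(
  date = as.Date(c("2020-01-01", "2020-02-01", "2020-03-01")),
  number_packages = c(100, 110, 125)
)

# Net change between consecutive months; the first month has no predecessor,
# so its change is NA.
toy$monthly_change <- c(NA, diff(toy$number_packages))
print(toy)
```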


cran_packages_history

Description

All packages ever available on CRAN. Last update: 2024-09-10.

Usage

cran_packages_history

Format

A data frame with 312 rows and 10 variables:

file_name

[character] Either the name of the .tar.gz file or the name of the archive folder holding the .tar.gz files of all versions ever released of a given package.

date

[POSIXct,POSIXt] The date of upload on CRAN.

time

[character] The time of upload on CRAN.

size

[character] The size of the .tar.gz file; '-' for an archive folder.

package_name

[character] The name of the package.

last_archived

[POSIXct,POSIXt] The date when one version was last archived.

archive

[logical] Was a version ever archived?

first_date

[POSIXct,POSIXt] The date of the first release.

n_versions

[integer] The number of versions released.

last_modified

[POSIXct,POSIXt] The date of last release.

Source

https://cran.rstudio.com/src/contrib/


Get first release of package

Description

Scrape every folder of the CRAN archive to retrieve both the date of the first release and the number of versions released for all archived packages.

Usage

get_package_first_release(package_name)

Arguments

package_name

A character string. The package name.

Value

A tibble with three columns: 'package_name', 'first_date' and 'n_versions'.
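
To illustrate the shape of the returned value, here is a minimal base R sketch that reduces a set of upload dates to the three columns named above. It is not the package's implementation: it does no scraping, it returns a plain data.frame rather than a tibble, and the package name 'examplepkg' and its dates are invented for illustration.

```r
# Hypothetical upload dates for three versions of one package (invented values).
uploads <- data.frame(
  package_name = "examplepkg",
  date = as.Date(c("2019-05-02", "2018-01-15", "2021-11-30"))
)

# One row per package: earliest upload date and number of versions seen.
summary_row <- data.frame(
  package_name = uploads$package_name[1],
  first_date = min(uploads$date),
  n_versions = nrow(uploads)
)
print(summary_row)
```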


Plot monthly evolution of package number on CRAN

Description

This function is a convenience tool to quickly draw a line showing the evolution of the number of packages on CRAN since its beginning. It uses the 'cran_monthly_package_number' dataset.

Usage

plot_cran_monthly_package_number()

Value

A ggplot object

Examples

plot_cran_monthly_package_number()

Get number of packages on CRAN on a given date using ppm

Description

This function queries ppm (Posit Package Manager) to retrieve the number of packages on CRAN on a given date.

Usage

get_package_number_ppm(dates, parallelize = FALSE)

Arguments

dates

A vector of dates. Either a character vector of the form "yyyy-mm-dd" or a vector of class "Date". All dates must be later than "2014-09-17", the date of the first CRAN snapshot on ppm.

parallelize

A logical. If TRUE, {furrr} is used to scrape ppm asynchronously.

Value

A data.frame with two columns: 'date' and 'n', the number of packages on CRAN on that 'date'.

Examples

get_package_number_ppm(c("2018-04-10", "2020-03-19"))
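
Because 'dates' must fall after the first ppm snapshot, it can help to filter candidate dates before querying. The following is a minimal base R sketch; 'is_valid_ppm_date' is a hypothetical helper that is not part of {cranology}, and it assumes the constraint is strict (the snapshot date itself is excluded).

```r
# Date of the first CRAN snapshot on ppm, as stated in the 'dates' argument.
first_snapshot <- as.Date("2014-09-17")

# Hypothetical helper: TRUE for dates strictly after the first snapshot.
# Accepts "yyyy-mm-dd" character vectors or Date vectors.
is_valid_ppm_date <- function(dates) {
  as.Date(dates) > first_snapshot
}

candidates <- c("2012-01-01", "2018-04-10", "2020-03-19")

# Keep only the dates that satisfy the documented constraint.
valid <- candidates[is_valid_ppm_date(candidates)]
print(valid)
```

The filtered vector could then be passed as the 'dates' argument of 'get_package_number_ppm()'.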

Scrape CRAN packages history

Description

This function is the workhorse of {cranology}. It scrapes https://cran.rstudio.com and generates two datasets, 'cran_packages_history' and 'cran_monthly_package_number'.

Usage

scrape_cran_history()

Details

* 'cran_packages_history': A data.frame gathering information about every package that has ever been on CRAN, including, among other things, the first release date and the number of versions released so far.
* 'cran_monthly_package_number': A data.frame holding the number of packages available on CRAN since its beginning. Data is provided on a monthly basis.

Value

A list of two data.frames: 'cran_packages_history' and 'cran_monthly_package_number'.

Examples

## Not run: 
scrape_cran_history()

## End(Not run)

Update 'cran_monthly_package_number' dataset

Description

Creating 'cran_monthly_package_number' with 'scrape_cran()' is a long process, as the underlying scraping operations are time-consuming. To update 'cran_monthly_package_number' more quickly, it is easier to rely on data from ppm. This is what this function does: it uses 'get_package_number_ppm()' to quickly update the dataset.

Usage

update_monthly_package_number(
  cran_monthly_package_number_df,
  parallelize = FALSE
)

Arguments

cran_monthly_package_number_df

A data.frame similar to the 'cran_monthly_package_number' dataset included within {cranology}.

parallelize

A logical. If TRUE, {furrr} is used to scrape ppm asynchronously.

Examples

# Simulate `cran_monthly_package_number` update
date_lag <- 3
df <- cran_monthly_package_number[
 1:(nrow(cran_monthly_package_number) - date_lag),
]
update_monthly_package_number(
 cran_monthly_package_number_df = df
)
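
The idea of an incremental update can be sketched in base R as finding the months that follow the last row of the data frame. This is only an illustration under stated assumptions, not the package's implementation: 'missing_months' is a hypothetical helper, the rows of 'toy_df' are invented, and the 'date' column is assumed to hold first-of-month dates.

```r
# Hypothetical helper: first-of-month dates after the last date in `df`,
# up to and including the month containing `until`.
# Assumes `df$date` holds first-of-month Date values.
missing_months <- function(df, until = Sys.Date()) {
  last_date <- max(df$date)
  # The month immediately following the last recorded one.
  start <- seq(last_date, by = "month", length.out = 2)[2]
  until <- as.Date(until)
  if (start > until) {
    # Nothing to update: return an empty Date vector.
    return(as.Date(character(0)))
  }
  seq(start, until, by = "month")
}

# Toy data frame in the documented format (invented values).
toy_df <- data.frame(
  date = as.Date(c("2024-05-01", "2024-06-01")),
  number_packages = c(1, 2)
)

to_fetch <- missing_months(toy_df, until = "2024-09-10")
print(to_fetch)
```

Dates obtained this way could then be passed to 'get_package_number_ppm()', provided they satisfy its constraint on 'dates'.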