Package: dataMaid 1.4.0

dataMaid: A Suite of Checks for Identification of Potential Errors in a Data Frame as Part of the Data Screening Process

Data screening is an important first step of any statistical analysis. dataMaid automatically generates a customizable data report with a thorough summary of the checks and their results, which a human can use to identify possible errors. It provides an extendable suite of tests for common potential errors in a dataset.

Authors: Anne Helby Petersen [aut], Claus Thorn Ekstrøm [aut, cre]

dataMaid_1.4.0.tar.gz
dataMaid_1.4.0.zip (r-4.5), dataMaid_1.4.0.zip (r-4.4), dataMaid_1.4.0.zip (r-4.3)
dataMaid_1.4.0.tgz (r-4.4-any), dataMaid_1.4.0.tgz (r-4.3-any)
dataMaid_1.4.0.tar.gz (r-4.5-noble), dataMaid_1.4.0.tar.gz (r-4.4-noble)
dataMaid_1.4.0.tgz (r-4.4-emscripten), dataMaid_1.4.0.tgz (r-4.3-emscripten)
dataMaid.pdf | dataMaid.html
dataMaid/json (API)

# Install 'dataMaid' in R:
install.packages('dataMaid', repos = c('https://ekstroem.r-universe.dev', 'https://cloud.r-project.org'))
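
Once the package is installed, the checks described above can be tried on the bundled toyData dataset. The following is a minimal sketch rather than an official quick start: it assumes that `events` is a column of toyData and that the result of a check can be read as a list with a `problem` entry. The report call is left commented out because it writes files to the working directory.

```r
library(dataMaid)
data(toyData)

# Run the default checks for the class of a single variable.
eventChecks <- check(toyData$events)
eventChecks

# Apply one of the exported check functions directly.
missingCodes <- identifyMissing(toyData$events)
missingCodes$problem

# Generate the full data report for the whole data frame.
# Commented out here because it creates files in the working directory.
# makeDataReport(toyData, replace = TRUE, openResult = FALSE)
```

Running `check()` on a single variable is a quick way to see which checks apply to its class before generating a full report.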

Peer review:

Bug tracker: https://github.com/ekstroem/datamaid/issues

Datasets:
  • artData - Semi-artificial data about masterpieces of art
  • bigPresidentData - Semi-artificial data about the US presidents
  • exampleData - Example data with zero-inflated variables
  • presidentData - Semi-artificial data about the US presidents
  • testData - Extended example data to test the features of dataMaid
  • toyData - Small example data to show the features of dataMaid

On CRAN:

data-cleaning, data-screening, reproducible-research

7.48 score 143 stars 209 scripts 1.0k downloads 1 mentions 62 exports 75 dependencies

Last updated 3 years ago from: d7571007db. Checks: OK: 1, ERROR: 6. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Nov 02 2024
R-4.5-win | ERROR | Nov 02 2024
R-4.5-linux | ERROR | Nov 02 2024
R-4.4-win | ERROR | Nov 02 2024
R-4.4-mac | ERROR | Nov 02 2024
R-4.3-win | ERROR | Nov 02 2024
R-4.3-mac | ERROR | Nov 02 2024

Exports: allCheckFunctions, allClasses, allSummaryFunctions, allVisualFunctions, basicVisual, centralValue, check, checkFunction, checkResult, classes, classes<-, countMissing, defaultCharacterChecks, defaultCharacterSummaries, defaultDateChecks, defaultDateSummaries, defaultFactorChecks, defaultFactorSummaries, defaultHavenlabelledChecks, defaultHavenlabelledSummaries, defaultIntegerChecks, defaultIntegerSummaries, defaultLabelledChecks, defaultLabelledSummaries, defaultLogicalChecks, defaultLogicalSummaries, defaultNumericChecks, defaultNumericSummaries, description, description<-, identifyCaseIssues, identifyLoners, identifyMissing, identifyNums, identifyOutliers, identifyOutliersTBStyle, identifyWhitespace, isCPR, isEmpty, isKey, isSingular, isSupported, makeCodebook, makeDataReport, messageGenerator, minMax, quartiles, refCat, render, setChecks, setSummaries, setVisuals, standardVisual, summarize, summaryFunction, summaryResult, tableVisual, uniqueValues, variableType, visualFunction, visualize, whoami_available

Dependencies: askpass, base64enc, bit, bit64, bslib, cachem, cli, clipr, colorspace, cpp11, crayon, curl, DEoptimR, digest, evaluate, fansi, farver, fastmap, fontawesome, forcats, fs, ggplot2, glue, gridExtra, gtable, haven, highr, hms, htmltools, httr, isoband, jquerylib, jsonlite, knitr, labeling, lattice, lifecycle, magrittr, MASS, Matrix, memoise, mgcv, mime, munsell, nlme, openssl, pander, pillar, pkgconfig, prettyunits, progress, R6, rappdirs, RColorBrewer, Rcpp, readr, rlang, rmarkdown, robustbase, sass, scales, stringi, sys, tibble, tidyselect, tinytex, tzdb, utf8, vctrs, viridisLite, vroom, whoami, withr, xfun, yaml

Extending dataMaid

Rendered from extending_dataMaid.Rmd using knitr::rmarkdown on Nov 02 2024.

Last update: 2020-07-03
Started: 2017-10-13
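
To illustrate what extending the suite of checks can look like, the sketch below defines a user-written check with the exported constructors `checkFunction()` and `checkResult()`. The function `hasNegatives`, its message text, and its choice of classes are hypothetical and are not part of dataMaid; the example only assumes that a check returns a `checkResult` built from a list with `problem`, `message` and `problemValues` entries.

```r
library(dataMaid)

# Hypothetical check (not part of dataMaid): flag negative values.
hasNegatives <- function(v, ...) {
  negatives <- unique(v[!is.na(v) & v < 0])
  problem <- length(negatives) > 0
  checkResult(list(
    problem = problem,
    message = if (problem) "Note: the variable contains negative values." else "",
    problemValues = if (problem) negatives else NULL
  ))
}

# Register the function as a checkFunction with a description and
# the variable classes it is meant for.
hasNegatives <- checkFunction(
  hasNegatives,
  description = "Identify negative values",
  classes = c("numeric", "integer")
)

# Call the new check directly on a small vector.
res <- hasNegatives(c(3, -1, NA, 7))
res$problem
res$problemValues
```

A check registered this way should then be selectable by name through `setChecks()` when calling `check()` or `makeDataReport()`; the help pages for those functions give the exact arguments.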

Readme and manuals

Help Manual

Help page | Topics
Overview of all available checkFunctions | allCheckFunctions
Vector of all variable classes in 'dataMaid' | allClasses
Overview of all available summaryFunctions | allSummaryFunctions
Overview of all available visualFunctions | allVisualFunctions
Semi-artificial data about masterpieces of art | artData
Produce distribution plots in the base R (graphics) style using 'plot' and 'barplot' | basicVisual
importFrom stats na.omit | basicVisualCFLB
Semi-artificial data about the US presidents (extended version) | bigPresidentData
summaryFunction for central values | centralValue
Perform checks of potential errors in variable/dataset | check
Create an object of class checkFunction | checkFunction
Create object of class checkResult | checkResult
Extract the contents of the attribute 'classes' | classes classes<-
Summary function for missing values | countMissing
Default checks for character variables | defaultCharacterChecks
Default summary functions for character variables | defaultCharacterSummaries
Default checks for Date variables | defaultDateChecks
Default summary functions for Date variables | defaultDateSummaries
Default checks for factor variables | defaultFactorChecks
Default summary functions for factor variables | defaultFactorSummaries
Default checks for haven_labelled variables | defaultHavenlabelledChecks
Default summary functions for haven_labelled variables | defaultHavenlabelledSummaries
Default checks for integer variables | defaultIntegerChecks
Default summary functions for integer variables | defaultIntegerSummaries
Default checks for labelled variables | defaultLabelledChecks
Default summary functions for labelled variables | defaultLabelledSummaries
Default checks for logical variables | defaultLogicalChecks
Default summary functions for logical variables | defaultLogicalSummaries
Default checks for numeric variables | defaultNumericChecks
Default summary functions for numeric variables | defaultNumericSummaries
Extract the contents of the attribute 'description' | description description<-
Example data with zero-inflated variables | exampleData
A checkFunction for identifying case issues | identifyCaseIssues
A checkFunction for identifying sparsely represented values (loners) | identifyLoners
A checkFunction for identifying miscoded missing values. | identifyMissing
A checkFunction | identifyNums
A checkFunction for identifying outliers | identifyOutliers
A checkFunction for identifying outliers Tukey boxplot style | identifyOutliersTBStyle
A checkFunction for identifying whitespace | identifyWhitespace
Check if a variable consists of Danish CPR numbers | isCPR
Check if a variable qualifies as a key | isKey
Check if a variable only contains a single value | isEmpty isSingular
Check if a variable has a class supported by dataMaid | isSupported
Produce a data codebook | makeCodebook
Produce a data report | makeDataReport
Produce a message for the output of a checkFunction | messageGenerator
summaryFunction for minimum and maximum | minMax
Semi-artificial data about the US presidents | presidentData
summaryFunction for quartiles | quartiles
summaryFunction that finds reference level for factor variables | refCat
Simplified Rmarkdown rendering | render
Set check arguments for makeDataReport | setChecks
Set summary arguments for makeDataReport | setSummaries
Set visual arguments for makeDataReport | setVisuals
Produce distribution plots using ggplot from ggplot2. | standardVisual
Summarize a variable/dataset | summarize
Create an object of class summaryFunction | summaryFunction
Create object of class summaryResult | summaryResult
Produce tables for the makeDataReport visualizations. | tableVisual
Extended example data to test the features of dataMaid | testData
Small example data to show the features of dataMaid | toyData
summaryFunction for unique values | uniqueValues
Summary function for original class | variableType
Create an object of class visualFunction | visualFunction
Produce distribution plots | visualize
Find out if the whoami package binaries are installed (git + whoami) | whoami_available