My boss once asked me to review an Open CDISC report for a set of SDTM dataset packages. I wasn't sure what to do at first. With the help of my colleagues, I gradually developed a sense of how to review the report.
I'd like to share my review process and some common findings from the Open CDISC summary report generated by Pinnacle 21.
Open CDISC report format: an Excel workbook containing Dataset Summary, Issue Summary, Details, and Rules, plus general information such as Configuration, Define.xml, Generated Date, Engine Version, MedDRA Version (for example, 23.1), and Terminology Version.
Dataset Summary: lists the processed sources with Domain, Label, Class, Source, number of Records, Errors, Warnings, and Notices.
Issue Summary: lists Source dataset, Rule ID, Message, Severity, Found, and Explanation.
We focus on explaining the items whose Severity is Error or Warning.
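As a rough illustration of that filtering step, here is a minimal Python sketch. It assumes the Issue Summary has already been read into a list of dictionaries keyed by the column names; the rows, counts, and the `NOTICE-EXAMPLE` rule ID are made up for the example and are not real report output.

```python
# Hypothetical rows that mimic the Issue Summary columns (not real report output).
issue_summary = [
    {"Source": "AE", "Rule ID": "SD0013", "Severity": "Error", "Found": 2},
    {"Source": "DM", "Rule ID": "SD1082", "Severity": "Error", "Found": 10},
    {"Source": "AE", "Rule ID": "SD1201", "Severity": "Warning", "Found": 5},
    {"Source": "DM", "Rule ID": "NOTICE-EXAMPLE", "Severity": "Notice", "Found": 1},
]

REVIEW_SEVERITIES = {"Error", "Warning"}


def issues_to_review(rows):
    """Keep Error/Warning rows; Errors first, then by descending Found count."""
    kept = [r for r in rows if r["Severity"] in REVIEW_SEVERITIES]
    return sorted(kept, key=lambda r: (r["Severity"] != "Error", -r["Found"]))


to_review = issues_to_review(issue_summary)
for row in to_review:
    print(row["Source"], row["Rule ID"], row["Severity"], row["Found"])
```

Sorting Errors ahead of Warnings is only my own habit for deciding what to explain first; the report itself does not require any particular order.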
Message | Severity | Explanation
SD1082: Variable length is too long for actual data | Error | Company x has defined a fixed length for standard variables across the domain.
SD0013: xSTDTC is after xENDTC | Error | Data issue, subjid=xxx, reported to DM
SD1117, SD1201: Duplicate records | Warning | xSPID is used for unique records
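To show why xSPID can justify a duplicate-records warning, here is a small Python sketch. The key variables (`USUBJID`, `AETERM`, `AESTDTC`) and the records are my own assumptions for illustration; they are not the exact key that the validator applies for these rules.

```python
from collections import defaultdict

# Hypothetical AE-like records; all values are made up for illustration.
records = [
    {"USUBJID": "001", "AETERM": "HEADACHE", "AESTDTC": "2020-01-05", "AESPID": "1"},
    {"USUBJID": "001", "AETERM": "HEADACHE", "AESTDTC": "2020-01-05", "AESPID": "2"},
    {"USUBJID": "002", "AETERM": "NAUSEA", "AESTDTC": "2020-02-10", "AESPID": "1"},
]


def find_duplicates(rows, keys):
    """Return groups of 0-based row indexes that share the same key values."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row.get(k, "") for k in keys)].append(i)
    return [idx for idx in groups.values() if len(idx) > 1]


base_keys = ["USUBJID", "AETERM", "AESTDTC"]  # assumed key, not the validator's rule
dups_without_spid = find_duplicates(records, base_keys)
dups_with_spid = find_duplicates(records, base_keys + ["AESPID"])
print(dups_without_spid)  # rows that look identical on the assumed key
print(dups_with_spid)     # same check once the sponsor ID is part of the key
```

If the second list comes back empty, the records are distinguishable by xSPID, which is the explanation we give for the warning.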
For records where the start date (xSTDTC) is after the end date (xENDTC), we check the raw data to identify the affected subject and event.
We can also cross-check with other datasets to see if similar issues happen multiple times.
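The start-after-end check can be reproduced outside the report to list the affected subjects. The sketch below assumes the dates are ISO 8601 strings in a consistent format without time zone offsets, possibly partial (such as `2020-03`), and compares them only at the precision both values share. The record values and the helper name `start_after_end` are hypothetical.

```python
def start_after_end(start, end):
    """True if start is later than end at the precision both values share."""
    if not start or not end:
        return False  # a missing date cannot be compared
    n = min(len(start), len(end))
    # ISO 8601 strings in the same format sort chronologically as text.
    return start[:n] > end[:n]


# Hypothetical AE-like records; all values are made up for illustration.
records = [
    {"USUBJID": "001", "AESEQ": 1, "AESTDTC": "2020-03-15", "AEENDTC": "2020-03-10"},
    {"USUBJID": "002", "AESEQ": 1, "AESTDTC": "2020-03", "AEENDTC": "2020-03-10"},
    {"USUBJID": "003", "AESEQ": 1, "AESTDTC": "2020-01-01", "AEENDTC": ""},
]

flagged = [
    (r["USUBJID"], r["AESEQ"])
    for r in records
    if start_after_end(r["AESTDTC"], r["AEENDTC"])
]
print(flagged)
```

Running the same function over another domain's start and end variables is one way to do the cross-check across datasets.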
Details: lists each message with its severity, the variable values involved, and the record numbers.
Rules: describes each Rule ID and the category it belongs to (Terminology, Presence, Format, Consistency, Limit).