Commit 6c27e517 authored by sanchezi

improve doc & README & DESCRIPTION + add verbose

parent ef50b711
Pipeline #68427 passed
......@@ -2,8 +2,8 @@ Package: kfino
Title: Kalman Filter for Impulse Noised Outliers
Version: 1.0.0
Authors@R: c(
person("Bertrand", "Cloez", email = "bertrand.cloez@inrae.fr", role = c("aut", "cre")),
person("Isabelle", "Sanchez", email = "isabelle.sanchez@inrae.fr", role = c("ctr")),
person("Bertrand", "Cloez", email = "bertrand.cloez@inrae.fr", role = c("aut")),
person("Isabelle", "Sanchez", email = "isabelle.sanchez@inrae.fr", role = c("aut", "cre")),
person("Benedicte", "Fontez", email = "benedicte.fontez@supagro.fr", role = c("ctr")))
Author: Bertrand Cloez [aut, cre],
Isabelle Sanchez [ctr],
......@@ -20,14 +20,14 @@ BugReports: https://forgemia.inra.fr/isabelle.sanchez/kfino/issues
Imports:
ggplot2,
dplyr,
foreach,
doParallel,
parallel
Suggests:
rmarkdown,
knitr,
testthat (>= 3.0.0),
covr
covr,
foreach,
doParallel,
parallel
VignetteBuilder: knitr
RoxygenNote: 7.2.1
Config/testthat/edition: 3
#' a dataset containing the WoW weighing for one animal of 203 observations
#' a dataset containing the WoW weighing for one animal of 203 observations.
#' https://doi.org/10.1016/j.compag.2018.08.022
#'
#' A dataset for kfino algorithm
#' @format a data.frame
......@@ -13,7 +14,8 @@
#' }
"spring1"
#' a dataset containing the WoW weighing for one animal (merinos lamb) of 397 observations
#' a dataset containing the WoW weighing for one animal (merinos lamb) of 397
#' observations. https://doi.org/10.1016/j.compag.2018.08.022
#'
#' A dataset for kfino algorithm
#' @format a data.frame
......@@ -29,7 +31,7 @@
"merinos1"
#' a dataset containing the WoW weighing for one animal (merinos lamb) of 345
#' observations, difficult to model
#' observations, difficult to model. https://doi.org/10.1016/j.compag.2018.08.022
#'
#' A dataset for kfino algorithm
#' @format a data.frame
......@@ -44,7 +46,8 @@
#' }
"merinos2"
#' a dataset containing the WoW weighing for 4 animals of 1296 observations
#' a dataset containing the WoW weighing for 4 animals of 1296 observations,
#' https://doi.org/10.1016/j.compag.2018.08.022
#'
#' A dataset for kfino algorithm
#' @format a data.frame
......
......@@ -13,11 +13,11 @@
#'
#' @details The produced graphic can be, according to typeG:
#' \describe{
#' \item{quali}{The detection of outliers with a qualitative rule: OK values,
#' KO values (outliers) and OOR values (out of range values defined
#' by the user in `kfino_fit`) }
#' \item{quanti}{The detection of outliers with a quantitative display using
#' the calculated probability of the kfino algorithm}
#' \item{quali}{This plot shows the detection of outliers with a qualitative
#' rule: OK values (black), KO values (outliers, purple) and OOR values
#' (out of range values defined by the user in `kfino_fit`, red) }
#' \item{quanti}{This plot shows the detection of outliers with a quantitative
#' display using the calculated probability of the kfino algorithm}
#' \item{prediction}{This plot shows the prediction of the analyzed variable
#' plus the OK values. Prediction corresponds to E[X_{t} | Y_{1...t}]
#' for each time point t. Between 2 time points, we used a simple
......
......@@ -15,6 +15,7 @@
#' default 10
#' @param kappaOpt numeric, truncation setting for initial parameters'
#' optimization, default 7
#' @param verbose logical, if TRUE prints information about the initial
#' parameters (optimized or not) to the console (optional), default FALSE.
#'
#' @details The initialization parameter list `param` contains:
#' \describe{
......@@ -37,8 +38,8 @@
#' \item{seqp}{numeric vector, sequence of pp probability to be correctly
#' weighted. default seq(0.5,0.7,0.1)}
#' }
#' It has to be given by the user following his knowledge of the animal or
#' the data set. All parameters are compulsory except m0, mm and pp that can be
#' It should be given by the user based on their knowledge of the animal or the
#' data set. All parameters are compulsory except m0, mm and pp that can be
#' optimized by the algorithm. In the optimization step, those three parameters
#' are initialized according to the input data (between the expert
#' range) using quantile of the Y distribution (varying between 0.2 and 0.8 for
......@@ -53,22 +54,27 @@
#'
#' @return an S3 list with two data frames and a list of vectors of
#' kfino results
#' \describe{
#' \item{detectOutlier}{The whole input data set with the detected outliers
#' flagged and prediction}
#'
#' @return detectOutlier: The whole input data set with the detected outliers
#' flagged and the prediction of the analyzed variable.
#' The following columns are joined to the columns
#' present in the input data set:
#' \describe{
#' \item{prediction}{the parameter of interest - Yvar - predicted}
#' \item{label_pred}{the probability of the value being well predicted}
#' \item{lwr}{lower bound of the confidence interval of the predicted value}
#' \item{upper}{upper bound of the confidence interval of the predicted value}
#' \item{upr}{upper bound of the confidence interval of the predicted value}
#' \item{flag}{flag of the value (OK value, KO value (outlier), OOR value
#' (out of range values defined by the user in `kfino_fit`))}
#' }
#' \item{PredictionOK}{A dataset with the predictions on possible values (OK
#' and KO values)}
#' \item{kfino.results}{kfino results (a list of vectors) on optimized input
#' parameters or not}
#' }
#' @return PredictionOK: A subset of `detectOutlier` data set with the predictions
#' of the analyzed variable on possible values (OK and KO values)
#' @return kfino.results: kfino results (a list of vectors containing the
#' prediction of the analyzed variable, the probability to be an
#' outlier, the likelihood, the confidence interval of
#' the prediction and the flag of the data) on input parameters that
#' were optimized if the user chose this option
#'
#' @export
#' @examples
#' data(spring1)
......@@ -90,14 +96,16 @@
#'
#' resu1<-kfino_fit(datain=spring1,
#' Tvar="dateNum",Yvar="Poids",
#' doOptim=TRUE,method="ML",param=param1)
#' doOptim=TRUE,method="ML",param=param1,
#' verbose=TRUE)
#' Sys.time() - t0
#'
#' # --- With Optimization on initial parameters - EM method
#' t0 <- Sys.time()
#' resu1b<-kfino_fit(datain=spring1,
#' Tvar="dateNum",Yvar="Poids",
#' doOptim=TRUE,method="EM",param=param1)
#' doOptim=TRUE,method="EM",param=param1,
#' verbose=TRUE)
#' Sys.time() - t0
#'
#' # --- Without Optimization on initial parameters
......@@ -116,7 +124,8 @@
#' resu2<-kfino_fit(datain=spring1,
#' Tvar="dateNum",Yvar="Poids",
#' param=param2,
#' doOptim=FALSE)
#' doOptim=FALSE,
#' verbose=FALSE)
#' Sys.time() - t0
#'
#' # complex data on merinos2 dataset
......@@ -141,7 +150,8 @@
kfino_fit<-function(datain,Tvar,Yvar,
param=NULL,
doOptim=TRUE,method="ML",
threshold=0.5,kappa=10,kappaOpt=7){
threshold=0.5,kappa=10,kappaOpt=7,
verbose=FALSE){
if( any(is.null(param[["expertMin"]]) |
is.null(param[["expertMax"]])) )
......@@ -206,16 +216,20 @@ kfino_fit<-function(datain,Tvar,Yvar,
if (N > 500){
# optim with sub-sampling
print("-------:")
print("Optimization of initial parameters ")
print("with sub-sampling and ML method - result:")
if (verbose){
print("-------:")
print("Optimization of initial parameters ")
print("with sub-sampling and ML method - result:")
}
bornem0=quantile(Y[1:N/4], probs = c(.2, .8))
m0opt=quantile(Y[1:N/4], probs = c(.5))
mmopt=quantile(Y[(3*N/4):N], probs = c(.5))
cat("range m0: ",bornem0,"\n")
cat("Initial m0opt: ",m0opt,"\n")
cat("Initial mmopt: ",mmopt,"\n")
if (verbose){
cat("range m0: ",bornem0,"\n")
cat("Initial m0opt: ",m0opt,"\n")
cat("Initial mmopt: ",mmopt,"\n")
}
popt=0.5
#--- Saving datain before sub-sampling
YY=Y
......@@ -274,11 +288,13 @@ kfino_fit<-function(datain,Tvar,Yvar,
Y=YY
Tps=TpsTps
N=NN
print("Optimized parameters with ML method: ")
cat("Optimized m0: ",m0opt,"\n")
cat("Optimized mm: ",mmopt,"\n")
cat("Optimized pp: ",popt,"\n")
print("-------:")
if (verbose){
print("Optimized parameters with ML method: ")
cat("Optimized m0: ",m0opt,"\n")
cat("Optimized mm: ",mmopt,"\n")
cat("Optimized pp: ",popt,"\n")
print("-------:")
}
resultat=KBO_known(param=list(mm=mmopt,
pp=popt,
......@@ -295,11 +311,13 @@ kfino_fit<-function(datain,Tvar,Yvar,
} else if (N > 50){
# optimization without sub-sampling, 2 methods, EM or ML
if (method == "EM"){
if (verbose){
print("-------:")
print("Optimization of initial parameters with EM method - result:")
print("no sub-sampling performed:")
bornem0=quantile(Y[1:N/2], probs = c(.2, .8))
cat("range m0: ",bornem0,"\n")
}
bornem0=quantile(Y[1:N/2], probs = c(.2, .8))
if (verbose) cat("range m0: ",bornem0,"\n")
#--- by dichotomy (bisection)
# lower bound
......@@ -448,11 +466,13 @@ kfino_fit<-function(datain,Tvar,Yvar,
popt<-popt_low
}
print("Optimized parameters with EM method: ")
cat("Optimized m0: ",m0opt,"\n")
cat("Optimized mm: ",mmopt,"\n")
cat("Optimized pp: ",popt,"\n")
print("-------:")
if (verbose){
print("Optimized parameters with EM method: ")
cat("Optimized m0: ",m0opt,"\n")
cat("Optimized mm: ",mmopt,"\n")
cat("Optimized pp: ",popt,"\n")
print("-------:")
}
resultat=KBO_known(param=list(mm=mmopt,
pp=popt,
m0=m0opt,
......@@ -466,17 +486,20 @@ kfino_fit<-function(datain,Tvar,Yvar,
threshold=threshold,Y=Y,Tps=Tps,N=N,kappa=kappa)
} else if (method == "ML"){
print("-------:")
print("Optimization of initial parameters with ML method - result:")
print("no sub-sampling performed:")
if (verbose){
print("-------:")
print("Optimization of initial parameters with ML method - result:")
print("no sub-sampling performed:")
}
bornem0=quantile(Y[1:N/4], probs = c(.2, .8))
m0opt=quantile(Y[1:N/4], probs = c(.5))
mmopt=quantile(Y[(3*N/4):N], probs = c(.5))
cat("range m0: ",bornem0,"\n")
cat("initial m0opt: ",m0opt,"\n")
cat("initial mmopt: ",mmopt,"\n")
if (verbose){
cat("range m0: ",bornem0,"\n")
cat("initial m0opt: ",m0opt,"\n")
cat("initial mmopt: ",mmopt,"\n")
}
popt=0.5
Vopt=KBO_L(list(m0=m0opt,
......@@ -515,11 +538,13 @@ kfino_fit<-function(datain,Tvar,Yvar,
}
}
print("Optimized parameters: ")
cat("Optimized m0: ",m0opt,"\n")
cat("Optimized mm: ",mmopt,"\n")
cat("Optimized pp: ",popt,"\n")
print("-------:")
if (verbose){
print("Optimized parameters: ")
cat("Optimized m0: ",m0opt,"\n")
cat("Optimized mm: ",mmopt,"\n")
cat("Optimized pp: ",popt,"\n")
print("-------:")
}
resultat=KBO_known(param=list(mm=mmopt,
pp=popt,
m0=m0opt,
......@@ -542,12 +567,15 @@ kfino_fit<-function(datain,Tvar,Yvar,
} else {
X<-c(m0,pp,mm)
}
print("-------:")
print("Optimization of initial parameters - result:")
print("Not enough data => No optimization performed:")
print("Used parameters: ")
print(X)
print("-------:")
if (verbose){
print("-------:")
print("Optimization of initial parameters - result:")
print("Not enough data => No optimization performed:")
print("Used parameters: ")
print(X)
print("-------:")
}
resultat=KBO_known(param=list(m0=X[[1]],
pp=X[[2]],
mm=X[[3]],
......@@ -574,10 +602,12 @@ kfino_fit<-function(datain,Tvar,Yvar,
} else {
X<-c(m0,pp,mm)
}
print("-------:")
print("No optimization of initial parameters:")
print("Used parameters: ")
print(X)
if (verbose){
print("-------:")
print("No optimization of initial parameters:")
print("Used parameters: ")
print(X)
}
resultat=KBO_known(param=list(m0=X[[1]],
pp=X[[2]],
mm=X[[3]],
......
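Note on the quantile initializations in the hunks above (an editor's sketch, not part of the commit): in R the `:` operator binds more tightly than `/`, so `Y[1:N/4]` is evaluated as `Y[(1:N)/4]`, and the fractional indices are then truncated toward zero. If the intent was "the first quarter of the series", which is only an assumption here, the two forms select different elements, as the toy vector below shows. `Y`, `N` and the `seq_len(floor(N / 4))` alternative are illustrative and not taken from the package.

```r
# Editor's sketch, not part of the commit: how R parses `1:N/4`.
# Toy data; the values are invented for illustration only.
N <- 8
Y <- c(10, 20, 30, 40, 50, 60, 70, 80)

# As written in the hunks: parsed as (1:N)/4 because `:` binds tighter than `/`.
idx_as_written <- 1:N/4
print(idx_as_written)

# Hypothetical alternative, assuming "first quarter of the series" was intended.
idx_first_quarter <- seq_len(floor(N / 4))
print(idx_first_quarter)

# Fractional indices are truncated toward zero and zero indices are dropped,
# so the first form repeats early elements instead of taking Y[1], ..., Y[N/4].
print(Y[idx_as_written])
print(Y[idx_first_quarter])
```

Because repeated elements change the sample that `quantile()` sees, the initial values of m0 may differ slightly between the two forms; whether that matters for the optimization would have to be checked against the package itself.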
......@@ -2,13 +2,13 @@
# kfino <img src='man/figures/logo.png' align="right" height="139" />
Kalman Filter for Impulse Noised Outliers
The **kfino** algorithm was developed for time courses in order to detect impulse noised outliers and predict the parameter of interest, mainly for data recorded on the walk-over-weighing system described in this publication:
OBJECTIVE AND DESCRIPTION ALGO
E. González-García *et al.* (2018) A mobile and automated walk-over-weighing system for a close and remote monitoring of liveweight in sheep. vol 153: 226-238. https://doi.org/10.1016/j.compag.2018.08.022
CREATE PKGDOWN SITE
**Kalman filter with impulse noised outliers** (kfino) is a robust sequential algorithm for filtering data with a large number of outliers. It is based on simple latent linear Gaussian processes, as in the Kalman filter method, and is devoted to detecting impulse-noised outliers, that is, data points that differ significantly from other observations.
**in progress**
The method is described in full detail in the following arXiv preprint: https://arxiv.org/abs/2208.00961.
## Installation
......@@ -30,10 +30,23 @@ library(kfino)
help(package="kfino")
```
Please have a look at the vignettes, which explain how to use the algorithm. The
main features are:
* filtering data with a large number of outliers
* predicting the analyzed variable
* providing useful graphics to interpret the data
![quali](man/figures/kfino_plot_quali.png)
![quanti](man/figures/kfino_plot_quanti.png)
![pred](man/figures/kfino_plot_pred.png)
## Citation
As a lot of time and effort were spent in creating the kfino algorithm, please cite it when using it for data analysis:
XXX
https://arxiv.org/abs/2208.00961.
See also citation() for citing R itself.
......
......@@ -34,8 +34,20 @@ param2<-list(m0=41,
resu2<-kfino_fit(datain=spring1,
Tvar="dateNum",Yvar="Poids",
param=param2,
doOptim=FALSE)
doOptim=FALSE,
verbose=TRUE)
## -----------------------------------------------------------------------------
# structure of detectOutlier data set
str(resu2$detectOutlier)
# head of PredictionOK data set
head(resu2$PredictionOK)
# structure of kfino.results list
str(resu2$kfino.results)
## -----------------------------------------------------------------------------
# flags are qualitative
kfino_plot(resuin=resu2,typeG="quali",
Tvar="Day",Yvar="Poids",Ident="IDE")
......@@ -60,7 +72,9 @@ param1<-list(m0=NULL,
resu1<-kfino_fit(datain=spring1,
Tvar="dateNum",Yvar="Poids",
doOptim=TRUE,param=param1)
param=param1,
doOptim=TRUE,
verbose=TRUE)
# flags are qualitative
kfino_plot(resuin=resu1,typeG="quali",
......@@ -99,7 +113,8 @@ param3<-list(m0=NULL,
resu3<-kfino_fit(datain=merinos1,
Tvar="dateNum",Yvar="Poids",
doOptim=TRUE,param=param3)
doOptim=TRUE,param=param3,
verbose=TRUE)
# flags are qualitative
kfino_plot(resuin=resu3,typeG="quali",
......
......@@ -58,9 +58,9 @@ dim(spring1)
head(spring1)
```
The range weight of this animal is between 30 and 75 kg and must be given in the initial parameters of the `kfino_fit()`function.
The weight range of this animal is between 30 and 75 kg and must be given in `param`, a list of initial parameters passed to the `kfino_fit()` function call.
The user can either perform an outlier detection (and prediction) given initial parameters or on optimized initial parameters (on m0, mm and pp):
The user can perform outlier detection (and prediction) either with given initial parameters or with optimized initial parameters (m0, mm and pp). The `param` list is composed of:
* m0 = (optional) the initial weight, NULL if the user wants to optimize it,
* mm = (optional) the target weight, NULL if the user wants to optimize it,
......@@ -77,7 +77,7 @@ The user can either perform an outlier detection (and prediction) given initial
# Kfino algorithm on the `spring1` dataset
## Parameters (m0, mm and pp) not optimized
If the user chooses to not optimize the initial parameters, all the list must be completed according to expert knowledge of the data set.
If the user chooses not to optimize the initial parameters, the whole list must be completed according to expert knowledge of the data set. Here, the user supposes that the initial weight is around 41 and the target one around 45.
```{r,error=TRUE}
# --- Without Optimisation on parameters
......@@ -96,8 +96,39 @@ param2<-list(m0=41,
resu2<-kfino_fit(datain=spring1,
Tvar="dateNum",Yvar="Poids",
param=param2,
doOptim=FALSE)
doOptim=FALSE,
verbose=TRUE)
```
`resu2` is a list of 3 elements:
* detectOutlier: the whole input data set with the detected outliers flagged and the prediction of the analyzed variable. The following columns are joined to the columns present in the input data set:
- prediction: the parameter of interest - Yvar - predicted
- label_pred: the probability of the value being well predicted
- lwr: lower bound of the confidence interval of the predicted value
- upr: upper bound of the confidence interval of the predicted value
- flag: flag of the value (OK value, KO value (outlier), OOR value
(out of range values defined by the user in `kfino_fit`))
* PredictionOK: a subset of the `detectOutlier` data set with the predictions
of the analyzed variable on possible values (OK and KO values)
* kfino.results: kfino results (a list of vectors containing the prediction, the probability to be an outlier, the likelihood, the confidence interval of the prediction and the flag of the data), computed on optimized input parameters if the user chose this option
```{r}
# structure of detectOutlier data set
str(resu2$detectOutlier)
# head of PredictionOK data set
head(resu2$PredictionOK)
# structure of kfino.results list
str(resu2$kfino.results)
```
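As a possible follow-up to the `str()` and `head()` calls above, the sketch below shows one way the `flag` column could be used to count and subset observations. It runs on a small mock data frame rather than on a real `kfino_fit()` result: the column names follow the list above, but every value is invented, and the assumption that `flag` holds the strings "OK", "KO" and "OOR" is taken from the documentation text, not verified against the package.

```r
# Mock stand-in for resu2$detectOutlier (hypothetical values, for illustration).
# Column names follow the documented output; the numbers are invented.
mock_detect <- data.frame(
  Day        = 1:6,
  Poids      = c(41.2, 12.0, 42.1, 95.0, 43.0, 60.3),
  prediction = c(41.3, 41.5, 41.9, 42.2, 42.6, 42.8),
  label_pred = c(0.98, 0.00, 0.95, 0.00, 0.97, 0.10),
  lwr        = c(39.8, 40.0, 40.4, 40.7, 41.1, 41.3),
  upr        = c(42.8, 43.0, 43.4, 43.7, 44.1, 44.3),
  flag       = c("OK", "OOR", "OK", "OOR", "OK", "KO"),
  stringsAsFactors = FALSE
)

# Count the observations per flag category.
print(table(mock_detect$flag))

# Keep only the observations flagged as OK, e.g. for downstream analysis.
ok_rows <- mock_detect[mock_detect$flag == "OK", ]
print(ok_rows[, c("Day", "Poids", "prediction", "lwr", "upr")])
```

With a real fit, `mock_detect` would be replaced by `resu2$detectOutlier`, provided the flag values match the assumption stated above.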
The `kfino_plot()` function allows the user to visualize the results:
```{r}
# flags are qualitative
kfino_plot(resuin=resu2,typeG="quali",
Tvar="Day",Yvar="Poids",Ident="IDE")
......@@ -107,6 +138,8 @@ kfino_plot(resuin=resu2,typeG="quanti",
Tvar="Day",Yvar="Poids",Ident="IDE")
```
## Parameters (m0, mm and pp) optimized
If the user chooses to optimize the initial parameters, m0, mm and pp must be set to NULL.
......@@ -127,7 +160,9 @@ param1<-list(m0=NULL,
resu1<-kfino_fit(datain=spring1,
Tvar="dateNum",Yvar="Poids",
doOptim=TRUE,param=param1)
param=param1,
doOptim=TRUE,
verbose=TRUE)
# flags are qualitative
kfino_plot(resuin=resu1,typeG="quali",
......@@ -178,7 +213,8 @@ param3<-list(m0=NULL,
resu3<-kfino_fit(datain=merinos1,
Tvar="dateNum",Yvar="Poids",
doOptim=TRUE,param=param3)
doOptim=TRUE,param=param3,
verbose=TRUE)
# flags are qualitative
kfino_plot(resuin=resu3,typeG="quali",
......
This diff is collapsed.
man/figures/kfino_plot_pred.png

93.6 KiB

man/figures/kfino_plot_quali.png

91.1 KiB

man/figures/kfino_plot_quanti.png

93.4 KiB

......@@ -22,11 +22,15 @@ Useful links:
}
\author{
\strong{Maintainer}: Bertrand Cloez \email{bertrand.cloez@inrae.fr}
\strong{Maintainer}: Isabelle Sanchez \email{isabelle.sanchez@inrae.fr}
Authors:
\itemize{
\item Bertrand Cloez \email{bertrand.cloez@inrae.fr}
}
Other contributors:
\itemize{
\item Isabelle Sanchez \email{isabelle.sanchez@inrae.fr} [contractor]
\item Benedicte Fontez \email{benedicte.fontez@supagro.fr} [contractor]
}
......
......@@ -13,7 +13,8 @@ kfino_fit(
method = "ML",
threshold = 0.5,
kappa = 10,
kappaOpt = 7
kappaOpt = 7,
verbose = FALSE
)
}
\arguments{
......@@ -41,26 +42,34 @@ default 10}
\item{kappaOpt}{numeric, truncation setting for initial parameters'
optimization, default 7}
\item{verbose}{logical, if TRUE prints information about the initial
parameters (optimized or not) to the console (optional), default FALSE.}
}
\value{
an S3 list with two data frames and a list of vectors of
kfino results
\describe{
\item{detectOutlier}{The whole input data set with the detected outliers
flagged and prediction}
detectOutlier: The whole input data set with the detected outliers
flagged and the prediction of the analyzed variable.
The following columns are joined to the columns
present in the input data set:
\describe{
\item{prediction}{the parameter of interest - Yvar - predicted}
\item{label_pred}{the probability of the value being well predicted}
\item{lwr}{lower bound of the confidence interval of the predicted value}
\item{upper}{upper bound of the confidence interval of the predicted value}
\item{upr}{upper bound of the confidence interval of the predicted value}
\item{flag}{flag of the value (OK value, KO value (outlier), OOR value
(out of range values defined by the user in `kfino_fit`))}
}
\item{PredictionOK}{A dataset with the predictions on possible values (OK
and KO values)}
\item{kfino.results}{kfino results (a list of vectors) on optimized input
parameters or not}
}
PredictionOK: A subset of `detectOutlier` data set with the predictions
of the analyzed variable on possible values (OK and KO values)
kfino.results: kfino results (a list of vectors containing the
prediction of the analyzed variable, the probability to be an
outlier, the likelihood, the confidence interval of
the prediction and the flag of the data) on input parameters that
were optimized if the user chose this option
}
\description{
kfino_fit a function to detect outlier with a Kalman Filtering approach
......@@ -87,8 +96,8 @@ The initialization parameter list `param` contains:
\item{seqp}{numeric vector, sequence of pp probability to be correctly
weighted. default seq(0.5,0.7,0.1)}
}
It has to be given by the user following his knowledge of the animal or
the data set. All parameters are compulsory except m0, mm and pp that can be
It should be given by the user based on their knowledge of the animal or the
data set. All parameters are compulsory except m0, mm and pp that can be
optimized by the algorithm. In the optimization step, those three parameters
are initialized according to the input data (between the expert
range) using quantile of the Y distribution (varying between 0.2 and 0.8 for
......@@ -117,14 +126,16 @@ param1<-list(m0=NULL,
resu1<-kfino_fit(datain=spring1,
Tvar="dateNum",Yvar="Poids",
doOptim=TRUE,method="ML",param=param1)
doOptim=TRUE,method="ML",param=param1,
verbose=TRUE)
Sys.time() - t0
# --- With Optimization on initial parameters - EM method
t0 <- Sys.time()
resu1b<-kfino_fit(datain=spring1,
Tvar="dateNum",Yvar="Poids",
doOptim=TRUE,method="EM",param=param1)
doOptim=TRUE,method="EM",param=param1,
verbose=TRUE)
Sys.time() - t0
# --- Without Optimization on initial parameters
......@@ -143,7 +154,8 @@ param2<-list(m0=41,
resu2<-kfino_fit(datain=spring1,
Tvar="dateNum",Yvar="Poids",
param=param2,
doOptim=FALSE)
doOptim=FALSE,
verbose=FALSE)
Sys.time() - t0
# complex data on merinos2 dataset
......
......@@ -43,11 +43,11 @@ kfino_plot a graphical function for the result of a kfino run
\details{
The produced graphic can be, according to typeG:
\describe{
\item{quali}{The detection of outliers with a qualitative rule: OK values,
KO values (outliers) and OOR values (out of range values defined
by the user in `kfino_fit`) }
\item{quanti}{The detection of outliers with a quantitative display using
the calculated probability of the kfino algorithm}
\item{quali}{This plot shows the detection of outliers with a qualitative
rule: OK values (black), KO values (outliers, purple) and OOR values
(out of range values defined by the user in `kfino_fit`, red) }
\item{quanti}{This plot shows the detection of outliers with a quantitative
display using the calculated probability of the kfino algorithm}
\item{prediction}{This plot shows the prediction of the analyzed variable
plus the OK values. Prediction corresponds to E[X_{t} | Y_{1...t}]
for each time point t. Between 2 time points, we used a simple
......
......@@ -3,7 +3,8 @@
\docType{data}
\name{lambs}
\alias{lambs}
\title{a dataset containing the WoW weighing for 4 animals of 1296 observations}
\title{a dataset containing the WoW weighing for 4 animals of 1296 observations,
https://doi.org/10.1016/j.compag.2018.08.022}
\format{
a data.frame
\describe{
......
......@@ -3,7 +3,8 @@
\docType{data}
\name{merinos1}
\alias{merinos1}
\title{a dataset containing the WoW weighing for one animal (merinos lamb) of 397 observations}
\title{a dataset containing the WoW weighing for one animal (merinos lamb) of 397
observations. https://doi.org/10.1016/j.compag.2018.08.022}
\format{
a data.frame
\describe{
......
......@@ -4,7 +4,7 @@
\name{merinos2}
\alias{merinos2}
\title{a dataset containing the WoW weighing for one animal (merinos lamb) of 345
observations, difficult to model}
observations, difficult to model. https://doi.org/10.1016/j.compag.2018.08.022}
\format{
a data.frame
\describe{
......
......@@ -3,7 +3,8 @@
\docType{data}
\name{spring1}
\alias{spring1}
\title{a dataset containing the WoW weighing for one animal of 203 observations}
\title{a dataset containing the WoW weighing for one animal of 203 observations.
https://doi.org/10.1016/j.compag.2018.08.022}
\format{
a data.frame
\describe{
......
......@@ -58,9 +58,9 @@ dim(spring1)
head(spring1)
```
The range weight of this animal is between 30 and 75 kg and must be given in the initial parameters of the `kfino_fit()`function.
The weight range of this animal is between 30 and 75 kg and must be given in `param`, a list of initial parameters passed to the `kfino_fit()` function call.
The user can either perform an outlier detection (and prediction) given initial parameters or on optimized initial parameters (on m0, mm and pp):
The user can perform outlier detection (and prediction) either with given initial parameters or with optimized initial parameters (m0, mm and pp). The `param` list is composed of:
* m0 = (optional) the initial weight, NULL if the user wants to optimize it,
* mm = (optional) the target weight, NULL if the user wants to optimize it,
......@@ -77,7 +77,7 @@ The user can either perform an outlier detection (and prediction) given initial
# Kfino algorithm on the `spring1` dataset
## Parameters (m0, mm and pp) not optimized
If the user chooses to not optimize the initial parameters, all the list must be completed according to expert knowledge of the data set.
If the user chooses not to optimize the initial parameters, the whole list must be completed according to expert knowledge of the data set. Here, the user supposes that the initial weight is around 41 and the target one around 45.
```{r,error=TRUE}
# --- Without Optimisation on parameters
......@@ -96,8 +96,39 @@ param2<-list(m0=41,
resu2<-kfino_fit(datain=spring1,
Tvar="dateNum",Yvar="Poids",
param=param2,
doOptim=FALSE)
doOptim=FALSE,
verbose=TRUE)
```
`resu2` is a list of 3 elements:
* detectOutlier: the whole input data set with the detected outliers flagged and the prediction of the analyzed variable. The following columns are joined to the columns present in the input data set:
- prediction: the parameter of interest - Yvar - predicted
- label_pred: the probability of the value being well predicted
- lwr: lower bound of the confidence interval of the predicted value
- upr: upper bound of the confidence interval of the predicted value
- flag: flag of the value (OK value, KO value (outlier), OOR value
(out of range values defined by the user in `kfino_fit`))
* PredictionOK: a subset of the `detectOutlier` data set with the predictions
of the analyzed variable on possible values (OK and KO values)
* kfino.results: kfino results (a list of vectors containing the prediction, the probability to be an outlier, the likelihood, the confidence interval of the prediction and the flag of the data), computed on optimized input parameters if the user chose this option
```{r}
# structure of detectOutlier data set
str(resu2$detectOutlier)
# head of PredictionOK data set
head(resu2$PredictionOK)
# structure of kfino.results list
str(resu2$kfino.results)
```
The `kfino_plot()` function allows the user to visualize the results:
```{r}
# flags are qualitative
kfino_plot(resuin=resu2,typeG="quali",
Tvar="Day",Yvar="Poids",Ident="IDE")
......@@ -107,6 +138,8 @@ kfino_plot(resuin=resu2,typeG="quanti",
Tvar="Day",Yvar="Poids",Ident="IDE")
```
## Parameters (m0, mm and pp) optimized
If the user chooses to optimize the initial parameters, m0, mm and pp must be set to NULL.
......@@ -127,7 +160,9 @@ param1<-list(m0=NULL,
resu1<-kfino_fit(datain=spring1,
Tvar="dateNum",Yvar="Poids",
doOptim=TRUE,param=param1)
param=param1,
doOptim=TRUE,
verbose=TRUE)
# flags are qualitative
kfino_plot(resuin=resu1,typeG="quali",
......@@ -178,7 +213,8 @@ param3<-list(m0=NULL,
resu3<-kfino_fit(datain=merinos1,
Tvar="dateNum",Yvar="Poids",
doOptim=TRUE,param=param3)
doOptim=TRUE,param=param3,
verbose=TRUE)
# flags are qualitative
kfino_plot(resuin=resu3,typeG="quali",
......
......@@ -71,7 +71,8 @@ param1<-list(m0=NULL,
```{r,error=TRUE}
resu1<-kfino_fit(datain=spring1,
Tvar="dateNum",Yvar="Poids",
doOptim=TRUE,method="ML",param=param1)
doOptim=TRUE,method="ML",param=param1,
verbose=TRUE)
# flags are qualitative
kfino_plot(resuin=resu1,typeG="quali",
......@@ -92,7 +93,8 @@ kfino_plot(resuin=resu1,typeG="prediction",
resu2<-kfino_fit(datain=spring1,
Tvar="dateNum",Yvar="Poids",
doOptim=TRUE,method="EM",param=param1)
doOptim=TRUE,method="EM",param=param1,
verbose=TRUE)
# flags are qualitative
kfino_plot(resuin=resu2,typeG="quali",
......