Data set for supervised learning.
#include <shark/Data/Dataset.h>
[Figures: inheritance and collaboration diagrams for shark::LabeledData< InputT, LabelT >]
Public Member Functions | |
| BOOST_STATIC_CONSTANT (std::size_t, DefaultBatchSize=InputContainer::DefaultBatchSize) | |
| const_element_range | elements () const |
| Returns the range of elements. | |
| element_range | elements () |
| Returns the range of elements. | |
| const_batch_range | batches () const |
| Returns the range of batches. | |
| batch_range | batches () |
| Returns the range of batches. | |
| std::size_t | numberOfBatches () const |
| Returns the number of batches of the set. | |
| std::size_t | numberOfElements () const |
| Returns the total number of elements. | |
| bool | empty () const |
| Checks whether the set is empty. | |
| InputContainer const & | inputs () const |
| Access to inputs as a separate container. | |
| InputContainer & | inputs () |
| Access to inputs as a separate container. | |
| LabelContainer const & | labels () const |
| Access to labels as a separate container. | |
| LabelContainer & | labels () |
| Access to labels as a separate container. | |
| LabeledData () | |
| Empty data set. | |
| LabeledData (std::size_t numBatches) | |
| Creates an empty set with just the correct number of batches. | |
| LabeledData (std::size_t size, element_type const &element, std::size_t batchSize=DefaultBatchSize) | |
| LabeledData (Data< InputType > const &inputs, Data< LabelType > const &labels) | |
| Construction from data. | |
| element_reference | element (std::size_t i) |
| const_element_reference | element (std::size_t i) const |
| batch_reference | batch (std::size_t i) |
| const_batch_reference | batch (std::size_t i) const |
| void | read (InArchive &archive) |
| From ISerializable. | |
| void | write (OutArchive &archive) const |
| From ISerializable. | |
| virtual void | makeIndependent () |
| This method makes the vector independent of all siblings and parents. | |
| virtual void | shuffle () |
| Shuffles all elements in the entire dataset (that is, also across the batches). | |
| void | splitBatch (std::size_t batch, std::size_t elementIndex) |
| self_type | splice (std::size_t batch) |
| Splits the container into two independent parts. The left part remains in the container, the right part is returned. | |
| template<class Range > | |
| void | repartition (Range const &batchSizes) |
| Reorders the batch structure in the container to that indicated by the batchSizes vector. | |
| void | indexedSubset (IndexSet const &indices, self_type &subset) const |
| Fills in the subset defined by the list of indices. | |
| void | indexedSubset (IndexSet const &indices, self_type &subset, self_type &complement) const |
| Fills in the subset defined by the list of indices as well as its complement. | |
Public Member Functions inherited from shark::ISerializable | |
| virtual | ~ISerializable () |
| Virtual destructor. | |
| void | load (InArchive &archive, unsigned int version) |
| Versioned loading of components, calls read(...). | |
| void | save (OutArchive &archive, unsigned int version) const |
| Versioned storing of components, calls write(...). | |
| BOOST_SERIALIZATION_SPLIT_MEMBER () | |
Protected Attributes | |
| InputContainer | m_data |
| LabelContainer | m_label |
| point data | |
Friends | |
| void | swap (LabeledData &a, LabeledData &b) |
Data set for supervised learning.
The LabeledData class builds on UnlabeledData (the InputContainer type) for the representation of inputs. In addition, it holds and provides access to the corresponding labels.
LabeledData presents the underlying data as pairs of input and label data. This means that when a batch is accessed by calling batch(i) or through one of the iterators, the input batch is available as batch(i).input and the labels as batch(i).label.
The same holds for single-element access using element(i). Be aware that direct access to an element is a linear-time operation, so it is advisable to iterate over the batches rather than over the individual elements.
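The access pattern described above can be illustrated with a small standalone sketch. It does not use Shark itself: `ToyBatch`, `ToyLabeledData`, `labelOfElement` and `sumOfLabels` are hypothetical stand-ins, written under the assumption that elements are stored batch by batch, so that finding element i means walking the batches, while batch-wise iteration touches each batch exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one batch: inputs and labels sit side by side,
// mirroring the batch(i).input / batch(i).label view described above.
struct ToyBatch {
    std::vector<double> input;
    std::vector<int> label;
};

// Hypothetical stand-in for the container; this is not Shark's implementation.
struct ToyLabeledData {
    std::vector<ToyBatch> batches;

    // Element access has to walk the batches until it reaches the one that
    // contains index i, so its cost grows with the number of batches.
    int labelOfElement(std::size_t i) const {
        for (ToyBatch const& b : batches) {
            if (i < b.label.size()) return b.label[i];
            i -= b.label.size();
        }
        assert(false && "element index out of range");
        return -1;
    }
};

// Preferred pattern: visit every batch once and process it as a whole.
inline long sumOfLabels(ToyLabeledData const& data) {
    long sum = 0;
    for (ToyBatch const& b : data.batches)
        for (int y : b.label) sum += y;
    return sum;
}
```

Calling `labelOfElement` inside a loop over all indices repeats the walk for every element, which is why the batch-wise loop in `sumOfLabels` is the pattern to prefer.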
| typedef PairRangeType< batch_type, typename InputContainer::batch_range, typename LabelContainer::batch_range >::type shark::LabeledData< InputT, LabelT >::batch_range |
| typedef boost::range_reference<batch_range>::type shark::LabeledData< InputT, LabelT >::batch_reference |
| typedef PairRangeType< batch_type, typename InputContainer::const_batch_range, typename LabelContainer::const_batch_range >::type shark::LabeledData< InputT, LabelT >::const_batch_range |
| typedef boost::range_reference<const_batch_range>::type shark::LabeledData< InputT, LabelT >::const_batch_reference |
| typedef PairRangeType< element_type, typename InputContainer::const_element_range, typename LabelContainer::const_element_range >::type shark::LabeledData< InputT, LabelT >::const_element_range |
| typedef boost::range_reference<const_element_range>::type shark::LabeledData< InputT, LabelT >::const_element_reference |
| typedef PairRangeType< element_type, typename InputContainer::element_range, typename LabelContainer::element_range >::type shark::LabeledData< InputT, LabelT >::element_range |
| typedef boost::range_reference<element_range>::type shark::LabeledData< InputT, LabelT >::element_reference |
| typedef InputContainer::IndexSet shark::LabeledData< InputT, LabelT >::IndexSet |
| typedef UnlabeledData<InputT> shark::LabeledData< InputT, LabelT >::InputContainer |
| typedef InputT shark::LabeledData< InputT, LabelT >::InputType |
| typedef Data<LabelT> shark::LabeledData< InputT, LabelT >::LabelContainer |
| typedef LabelT shark::LabeledData< InputT, LabelT >::LabelType |
|
inline |
Definition at line 605 of file Dataset.h.
Referenced by shark::binarySubProblem(), shark::createCVIndexed(), shark::detail::createCVSameSizeBalanced(), shark::KernelTargetAlignment< InputType >::evalDerivative(), shark::LabeledDataDistribution< RealVector, unsigned int >::generateDataset(), shark::SimpleNearestNeighbors< InputType, LabelType >::getNeighbors(), shark::QpBoxLinear::QpBoxLinear(), shark::QpMcLinear::QpMcLinear(), and shark::LinearRegression::train().
| shark::LabeledData< InputT, LabelT >::batches () | inline |
Returns the range of batches.
The range is compatible with boost::range and the STL and can be used whenever an algorithm requires batch access via begin()/end(), in which case data.batches() provides the correct interface.
Definition at line 515 of file Dataset.h.
Referenced by shark::detail::SparseFFNetErrorWrapper< HiddenNeuron, OutputNeuron >::eval(), shark::detail::LossBasedErrorFunctionImpl< InputType, LabelType, OutputType >::evalDerivativePointSet(), shark::detail::LossBasedErrorFunctionImpl< InputType, LabelType, OutputType >::evalPointSet(), and shark::LDA::train().
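As a rough illustration of what "compatible with the STL" means for such a range, the sketch below passes a begin()/end() view to `std::accumulate`. It is independent of Shark: `IntBatch`, `BatchRangeView` and `countElements` are hypothetical names, and the view simply forwards to a `std::vector` of batches.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

using IntBatch = std::vector<int>;  // hypothetical batch type

// Hypothetical view that exposes begin()/end() over stored batches, which is
// all a standard algorithm or a range-based for loop needs.
struct BatchRangeView {
    std::vector<IntBatch> const* storage;

    std::vector<IntBatch>::const_iterator begin() const { return storage->begin(); }
    std::vector<IntBatch>::const_iterator end() const { return storage->end(); }
};

// Counts all elements by summing the batch sizes with a standard algorithm.
inline std::size_t countElements(BatchRangeView range) {
    assert(range.storage != nullptr);
    return std::accumulate(
        range.begin(), range.end(), std::size_t(0),
        [](std::size_t n, IntBatch const& b) { return n + b.size(); });
}
```

Any type that offers begin() and end() in this way can be handed to standard algorithms without a dedicated adapter.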
| shark::LabeledData< InputT, LabelT >::BOOST_STATIC_CONSTANT (std::size_t, DefaultBatchSize=InputContainer::DefaultBatchSize) |
|
inline |
Definition at line 597 of file Dataset.h.
Referenced by shark::RFTrainer::buildTree(), shark::CSvmTrainer< InputType, CacheType >::computeBias(), shark::RFTrainer::createCountMatrix(), shark::CARTTrainer::createCountMatrix(), shark::LooErrorCSvm< InputType, CacheType >::eval(), shark::export_libsvm(), shark::ModifiedKernelMatrix< InputType, CacheType >::ModifiedKernelMatrix(), shark::numberOfClasses(), shark::QpBoxLinear::QpBoxLinear(), shark::QpMcLinear::QpMcLinear(), shark::KernelTargetAlignment< InputType >::setDataset(), shark::Perceptron< InputType >::train(), shark::RFTrainer::train(), shark::McSvmCSTrainer< InputType, CacheType >::train(), shark::McSvmMMRTrainer< InputType, CacheType >::train(), shark::McSvmWWTrainer< InputType, CacheType >::train(), shark::McSvmLLWTrainer< InputType, CacheType >::train(), shark::McSvmADMTrainer< InputType, CacheType >::train(), shark::McSvmATSTrainer< InputType, CacheType >::train(), and shark::McSvmATMTrainer< InputType, CacheType >::train().
| shark::LabeledData< InputT, LabelT >::elements () | inline |
Returns the range of elements.
The range is compatible with boost::range and the STL and can be used whenever an algorithm requires element access via begin()/end(), in which case data.elements() provides the correct interface.
Definition at line 500 of file Dataset.h.
Referenced by shark::Centroids::initFromData(), shark::JaakkolaHeuristic::JaakkolaHeuristic(), shark::kMeans(), main(), shark::FisherLDA::meanAndScatter(), shark::operator<<(), shark::repartitionByClass(), shark::LabeledData< InputType, LabelType >::shuffle(), shark::KernelMeanClassifier< InputType >::train(), and shark::SigmoidFitPlatt::train().
| bool shark::LabeledData< InputT, LabelT >::empty () const | inline |
Check whether the set is empty.
Definition at line 536 of file Dataset.h.
Referenced by shark::RadiusMarginQuotient< InputType, CacheType >::eval(), shark::RadiusMarginQuotient< InputType, CacheType >::evalDerivative(), shark::LDA::train(), and shark::FisherLDA::train().
| shark::LabeledData< InputT, LabelT >::inputs () | inline |
Access to inputs as a separate container.
Definition at line 541 of file Dataset.h.
Referenced by shark::RadiusMarginQuotient< InputType, CacheType >::computeRadiusMargin(), shark::LooErrorCSvm< InputType, CacheType >::eval(), shark::NegativeGaussianProcessEvidence< InputType, OutputType, LabelType >::eval(), shark::CrossValidationError< ModelTypeT, LabelTypeT >::eval(), shark::detail::CostBasedErrorFunctionImpl< InputType, LabelType, OutputType >::eval(), shark::SvmLogisticInterpretation< InputType >::eval(), shark::RadiusMarginQuotient< InputType, CacheType >::evalDerivative(), shark::NegativeGaussianProcessEvidence< InputType, OutputType, LabelType >::evalDerivative(), shark::SvmLogisticInterpretation< InputType >::evalDerivative(), experiment(), shark::export_csv(), shark::export_kernel_matrix(), shark::LassoRegression< InputVectorType >::fillData(), shark::inputDimension(), main(), run_one_trial(), shark::KernelMeanClassifier< InputType >::train(), shark::Perceptron< InputType >::train(), shark::CARTTrainer::train(), shark::RFTrainer::train(), shark::RegularizationNetworkTrainer< InputType >::train(), shark::MissingFeatureSvmTrainer< InputType, CacheType >::train(), shark::McSvmOVATrainer< InputType, CacheType >::train(), shark::McSvmCSTrainer< InputType, CacheType >::train(), shark::McSvmMMRTrainer< InputType, CacheType >::train(), shark::McSvmADMTrainer< InputType, CacheType >::train(), shark::McSvmLLWTrainer< InputType, CacheType >::train(), shark::McSvmWWTrainer< InputType, CacheType >::train(), shark::McSvmATMTrainer< InputType, CacheType >::train(), shark::McSvmATSTrainer< InputType, CacheType >::train(), shark::CSvmTrainer< InputType, CacheType >::train(), shark::EpsilonSvmTrainer< InputType, CacheType >::train(), shark::transformInputs(), and shark::transformLabels().
| shark::LabeledData< InputT, LabelT >::labels () | inline |
Access to labels as a separate container.
Definition at line 550 of file Dataset.h.
Referenced by shark::classSizes(), shark::RadiusMarginQuotient< InputType, CacheType >::computeRadiusMargin(), shark::LooErrorCSvm< InputType, CacheType >::eval(), shark::NegativeGaussianProcessEvidence< InputType, OutputType, LabelType >::eval(), shark::CrossValidationError< ModelTypeT, LabelTypeT >::eval(), shark::detail::CostBasedErrorFunctionImpl< InputType, LabelType, OutputType >::eval(), shark::SvmLogisticInterpretation< InputType >::eval(), shark::NegativeGaussianProcessEvidence< InputType, OutputType, LabelType >::evalDerivative(), shark::SvmLogisticInterpretation< InputType >::evalDerivative(), experiment(), shark::export_csv(), shark::export_kernel_matrix(), shark::LassoRegression< InputVectorType >::fillData(), shark::kMeans(), shark::labelDimension(), main(), shark::numberOfClasses(), run_one_trial(), shark::KernelMeanClassifier< InputType >::train(), shark::CARTTrainer::train(), shark::RegularizationNetworkTrainer< InputType >::train(), shark::McSvmCSTrainer< InputType, CacheType >::train(), shark::McSvmMMRTrainer< InputType, CacheType >::train(), shark::McSvmADMTrainer< InputType, CacheType >::train(), shark::McSvmLLWTrainer< InputType, CacheType >::train(), shark::McSvmWWTrainer< InputType, CacheType >::train(), shark::McSvmATSTrainer< InputType, CacheType >::train(), shark::McSvmATMTrainer< InputType, CacheType >::train(), shark::CSvmTrainer< InputType, CacheType >::train(), shark::transformInputs(), and shark::transformLabels().
| std::size_t shark::LabeledData< InputT, LabelT >::numberOfBatches () const | inline |
Returns the number of batches of the set.
Definition at line 527 of file Dataset.h.
Referenced by shark::binarySubProblem(), shark::detail::ParallelLossBasedErrorFunctionImpl< InputType, LabelType, OutputType >::eval(), shark::KernelTargetAlignment< InputType >::evalDerivative(), shark::detail::ParallelLossBasedErrorFunctionImpl< InputType, LabelType, OutputType >::evalDerivative(), shark::LassoRegression< InputVectorType >::fillData(), shark::SimpleNearestNeighbors< InputType, LabelType >::getNeighbors(), shark::QpBoxLinear::QpBoxLinear(), shark::QpMcLinear::QpMcLinear(), and shark::LinearRegression::train().
| std::size_t shark::LabeledData< InputT, LabelT >::numberOfElements () const | inline |
Returns the total number of elements.
Definition at line 531 of file Dataset.h.
Referenced by shark::RadiusMarginQuotient< InputType, CacheType >::computeRadiusMargin(), shark::RFTrainer::createCountMatrix(), shark::CARTTrainer::createCountMatrix(), shark::NegativeGaussianProcessEvidence< InputType, OutputType, LabelType >::eval(), shark::SvmLogisticInterpretation< InputType >::eval(), shark::detail::ParallelLossBasedErrorFunctionImpl< InputType, LabelType, OutputType >::eval(), shark::detail::SparseFFNetErrorWrapper< HiddenNeuron, OutputNeuron >::eval(), shark::NegativeGaussianProcessEvidence< InputType, OutputType, LabelType >::evalDerivative(), shark::SvmLogisticInterpretation< InputType >::evalDerivative(), shark::detail::ParallelLossBasedErrorFunctionImpl< InputType, LabelType, OutputType >::evalDerivative(), shark::detail::LossBasedErrorFunctionImpl< InputType, LabelType, OutputType >::evalDerivativePointSet(), shark::detail::LossBasedErrorFunctionImpl< InputType, LabelType, OutputType >::evalPointSet(), shark::export_libsvm(), main(), shark::FisherLDA::meanAndScatter(), shark::numberOfClasses(), shark::KernelTargetAlignment< InputType >::setDataset(), shark::Pegasos< VectorType >::solve(), shark::McPegasos< VectorType >::solve(), shark::KernelMeanClassifier< InputType >::train(), shark::Perceptron< InputType >::train(), shark::NBClassifierTrainer< InputType, OutputType >::train(), shark::CARTTrainer::train(), shark::RFTrainer::train(), shark::LinearRegression::train(), shark::LDA::train(), shark::McSvmCSTrainer< InputType, CacheType >::train(), shark::McSvmMMRTrainer< InputType, CacheType >::train(), shark::McSvmLLWTrainer< InputType, CacheType >::train(), shark::McSvmADMTrainer< InputType, CacheType >::train(), shark::McSvmWWTrainer< InputType, CacheType >::train(), shark::McSvmATSTrainer< InputType, CacheType >::train(), and shark::McSvmATMTrainer< InputType, CacheType >::train().
| void shark::LabeledData< InputT, LabelT >::read (InArchive &archive) | inline virtual |
From ISerializable.
Reimplemented from shark::ISerializable.
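| template<class Range > void shark::LabeledData< InputT, LabelT >::repartition (Range const &batchSizes) | inline |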
Reorders the batch structure in the container to that indicated by the batchSizes vector.
After the operation the container contains batchSizes.size() batches, with the i-th batch having size batchSizes[i]. The sum of all batch sizes must equal the current number of elements.
Definition at line 656 of file Dataset.h.
Referenced by shark::repartitionByClass().
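The repartitioning rule can be sketched on plain vectors. This is not the code in Dataset.h: `repartitionSketch` is a hypothetical helper that assumes the element order is preserved and that a mismatch between the requested sizes and the number of elements is an error.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Regroups the elements of `batches` so that the i-th new batch holds
// batchSizes[i] elements. The element order is left unchanged.
inline std::vector<std::vector<int>> repartitionSketch(
    std::vector<std::vector<int>> const& batches,
    std::vector<std::size_t> const& batchSizes)
{
    // Flatten the current batch structure into one sequence.
    std::vector<int> flat;
    for (auto const& b : batches) flat.insert(flat.end(), b.begin(), b.end());

    // Precondition: the requested sizes must add up to the element count.
    std::size_t const total =
        std::accumulate(batchSizes.begin(), batchSizes.end(), std::size_t(0));
    if (total != flat.size())
        throw std::invalid_argument("batch sizes must sum to the number of elements");

    // Cut the flat sequence into consecutive chunks of the requested sizes.
    std::vector<std::vector<int>> result;
    result.reserve(batchSizes.size());
    std::size_t pos = 0;
    for (std::size_t size : batchSizes) {
        auto first = flat.begin() + static_cast<std::ptrdiff_t>(pos);
        auto last = first + static_cast<std::ptrdiff_t>(size);
        result.emplace_back(first, last);
        pos += size;
    }
    assert(pos == flat.size());
    return result;
}
```

For example, two batches of sizes 3 and 2 repartitioned with sizes {2, 2, 1} yield three batches holding the same five elements in the same order.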
| self_type shark::LabeledData< InputT, LabelT >::splice (std::size_t batch) | inline |
Splits the container into two independent parts. The left part remains in the container, the right part is returned.
The order of the elements remains unchanged. The SharedVector must not be shared for this to work.
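A minimal sketch of this split on a plain vector of batches follows. `Batches` and `spliceSketch` are hypothetical names, and the sketch assumes that the argument is the index of the first batch that moves to the right part; the actual semantics should be checked against Dataset.h.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

using Batches = std::vector<std::vector<int>>;  // hypothetical storage type

// Keeps batches [0, batch) in `data` and returns batches [batch, end).
// The order of batches and of the elements inside them is unchanged.
inline Batches spliceSketch(Batches& data, std::size_t batch) {
    assert(batch <= data.size());
    auto split = data.begin() + static_cast<std::ptrdiff_t>(batch);
    // Move the right part out, then drop the moved-from batches on the left.
    Batches right(std::make_move_iterator(split),
                  std::make_move_iterator(data.end()));
    data.erase(split, data.end());
    return right;
}
```

After the call the two parts own disjoint storage, which corresponds to the "independent parts" wording above.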
| void shark::LabeledData< InputT, LabelT >::write (OutArchive &archive) const | inline virtual |
From ISerializable.
Reimplemented from shark::ISerializable.
| InputContainer shark::LabeledData< InputT, LabelT >::m_data | protected |
Definition at line 685 of file Dataset.h.
Referenced by shark::LabeledData< InputType, LabelType >::batch(), shark::LabeledData< InputType, LabelType >::batches(), shark::LabeledData< InputType, LabelType >::element(), shark::LabeledData< InputType, LabelType >::elements(), shark::LabeledData< InputType, LabelType >::empty(), shark::LabeledData< InputType, LabelType >::indexedSubset(), shark::LabeledData< InputType, LabelType >::inputs(), shark::LabeledData< InputType, LabelType >::makeIndependent(), shark::LabeledData< InputType, LabelType >::numberOfBatches(), shark::LabeledData< InputType, LabelType >::numberOfElements(), shark::LabeledData< InputType, LabelType >::read(), shark::LabeledData< InputType, LabelType >::repartition(), shark::LabeledData< InputType, LabelType >::splice(), shark::LabeledData< InputType, LabelType >::splitBatch(), and shark::LabeledData< InputType, LabelType >::write().
| LabelContainer shark::LabeledData< InputT, LabelT >::m_label | protected |
point data
Definition at line 686 of file Dataset.h.
Referenced by shark::LabeledData< InputType, LabelType >::batch(), shark::LabeledData< InputType, LabelType >::batches(), shark::LabeledData< InputType, LabelType >::element(), shark::LabeledData< InputType, LabelType >::elements(), shark::LabeledData< InputType, LabelType >::indexedSubset(), shark::LabeledData< InputType, LabelType >::labels(), shark::LabeledData< InputType, LabelType >::makeIndependent(), shark::LabeledData< InputType, LabelType >::read(), shark::LabeledData< InputType, LabelType >::repartition(), shark::LabeledData< InputType, LabelType >::splice(), shark::LabeledData< InputType, LabelType >::splitBatch(), and shark::LabeledData< InputType, LabelType >::write().