shark::LabeledData< InputT, LabelT > Class Template Reference

Data set for supervised learning. More...

#include <shark/Data/Dataset.h>


Public Types

typedef InputT InputType
 
typedef LabelT LabelType
 
typedef UnlabeledData< InputT > InputContainer
 
typedef Data< LabelT > LabelContainer
 
typedef InputContainer::IndexSet IndexSet
 
typedef PairRangeType
< element_type, typename
InputContainer::element_range,
typename
LabelContainer::element_range >
::type 
element_range
 
typedef PairRangeType
< element_type, typename
InputContainer::const_element_range,
typename
LabelContainer::const_element_range >
::type 
const_element_range
 
typedef PairRangeType
< batch_type, typename
InputContainer::batch_range,
typename
LabelContainer::batch_range >
::type 
batch_range
 
typedef PairRangeType
< batch_type, typename
InputContainer::const_batch_range,
typename
LabelContainer::const_batch_range >
::type 
const_batch_range
 
typedef boost::range_reference
< batch_range >::type 
batch_reference
 
typedef boost::range_reference
< const_batch_range >::type 
const_batch_reference
 
typedef boost::range_reference
< element_range >::type 
element_reference
 
typedef boost::range_reference
< const_element_range >::type 
const_element_reference
 

Public Member Functions

 BOOST_STATIC_CONSTANT (std::size_t, DefaultBatchSize=InputContainer::DefaultBatchSize)
 
const_element_range elements () const
 Returns the range of elements. More...
 
element_range elements ()
 Returns the range of elements. More...
 
const_batch_range batches () const
 Returns the range of batches. More...
 
batch_range batches ()
 Returns the range of batches. More...
 
std::size_t numberOfBatches () const
 Returns the number of batches of the set. More...
 
std::size_t numberOfElements () const
 Returns the total number of elements. More...
 
bool empty () const
 Check whether the set is empty. More...
 
InputContainer const & inputs () const
 Access to inputs as a separate container. More...
 
InputContainer & inputs ()
 Access to inputs as a separate container. More...
 
LabelContainer const & labels () const
 Access to labels as a separate container. More...
 
LabelContainer & labels ()
 Access to labels as a separate container. More...
 
 LabeledData ()
 Empty data set. More...
 
 LabeledData (std::size_t numBatches)
 Create an empty set with just the correct number of batches. More...
 
 LabeledData (std::size_t size, element_type const &element, std::size_t batchSize=DefaultBatchSize)
 
 LabeledData (Data< InputType > const &inputs, Data< LabelType > const &labels)
 Construction from data. More...
 
element_reference element (std::size_t i)
 
const_element_reference element (std::size_t i) const
 
batch_reference batch (std::size_t i)
 
const_batch_reference batch (std::size_t i) const
 
void read (InArchive &archive)
 from ISerializable More...
 
void write (OutArchive &archive) const
 from ISerializable More...
 
virtual void makeIndependent ()
 This method makes the vector independent of all siblings and parents. More...
 
virtual void shuffle ()
 Shuffles all elements in the entire dataset (that is, also across batches). More...
 
void splitBatch (std::size_t batch, std::size_t elementIndex)
 
self_type splice (std::size_t batch)
 Splits the container into two independent parts. The left part remains in the container; the right part is returned. More...
 
template<class Range >
void repartition (Range const &batchSizes)
 Reorders the batch structure in the container to that indicated by the batchSizes vector. More...
 
void indexedSubset (IndexSet const &indices, self_type &subset) const
 Fill in the subset defined by the list of indices. More...
 
void indexedSubset (IndexSet const &indices, self_type &subset, self_type &complement) const
 Fill in the subset defined by the list of indices as well as its complement. More...
 
- Public Member Functions inherited from shark::ISerializable
virtual ~ISerializable ()
 Virtual d'tor. More...
 
void load (InArchive &archive, unsigned int version)
 Versioned loading of components, calls read(...). More...
 
void save (OutArchive &archive, unsigned int version) const
 Versioned storing of components, calls write(...). More...
 
 BOOST_SERIALIZATION_SPLIT_MEMBER ()
 

Protected Attributes

InputContainer m_data
 
LabelContainer m_label
 point data More...
 

Friends

void swap (LabeledData &a, LabeledData &b)
 

Detailed Description

template<class InputT, class LabelT>
class shark::LabeledData< InputT, LabelT >

Data set for supervised learning.

The LabeledData class extends UnlabeledData for the representation of inputs. In addition it holds and provides access to the corresponding labels.

LabeledData tries to mimic the underlying data as pairs of input and label data. This means that when accessing a batch by calling batch(i), or through one of the iterators, one accesses the input batch via batch(i).input and the labels via batch(i).label.

This also holds for single-element access using operator(). Be aware that direct access to an element is a linear-time operation, so it is not advisable to iterate over the elements; iterate over the batches instead.
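
The access pattern described above can be illustrated with a small stand-alone sketch. It does not use Shark: ToyBatch, ToyLabeledData, and sumInputsWithLabel are hypothetical stand-ins that only mirror the input/label naming, and the linear-time lookup in element(i) is an assumption about how a batch-structured container might locate an element.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for one batch: parallel inputs and labels.
struct ToyBatch {
    std::vector<double> input;
    std::vector<int> label;
};

// Hypothetical stand-in for a batch-structured labeled data set (not Shark's class).
struct ToyLabeledData {
    std::vector<ToyBatch> batches;

    // Element access walks the batches until it reaches the one holding
    // index i, so its cost grows with the number of batches.
    std::pair<double, int> element(std::size_t i) const {
        for (const ToyBatch& b : batches) {
            assert(b.input.size() == b.label.size());
            if (i < b.input.size()) return {b.input[i], b.label[i]};
            i -= b.input.size();
        }
        throw std::out_of_range("element index out of range");
    }
};

// Preferred pattern: visit each batch once and process it as a whole.
inline double sumInputsWithLabel(const ToyLabeledData& data, int wanted) {
    double sum = 0.0;
    for (const ToyBatch& b : data.batches) {
        for (std::size_t j = 0; j < b.input.size(); ++j) {
            if (b.label[j] == wanted) sum += b.input[j];
        }
    }
    return sum;
}
```

In this sketch, calling element(i) in a loop over all indices repeats the batch walk for every index, whereas the batch-wise loop touches each batch exactly once.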

Definition at line 444 of file Dataset.h.

Member Typedef Documentation

template<class InputT, class LabelT>
typedef PairRangeType< batch_type, typename InputContainer::batch_range, typename LabelContainer::batch_range >::type shark::LabeledData< InputT, LabelT >::batch_range

Definition at line 483 of file Dataset.h.

template<class InputT, class LabelT>
typedef boost::range_reference<batch_range>::type shark::LabeledData< InputT, LabelT >::batch_reference

Definition at line 491 of file Dataset.h.

template<class InputT, class LabelT>
typedef PairRangeType< batch_type, typename InputContainer::const_batch_range, typename LabelContainer::const_batch_range >::type shark::LabeledData< InputT, LabelT >::const_batch_range

Definition at line 488 of file Dataset.h.

template<class InputT, class LabelT>
typedef boost::range_reference<const_batch_range>::type shark::LabeledData< InputT, LabelT >::const_batch_reference

Definition at line 492 of file Dataset.h.

template<class InputT, class LabelT>
typedef PairRangeType< element_type, typename InputContainer::const_element_range, typename LabelContainer::const_element_range >::type shark::LabeledData< InputT, LabelT >::const_element_range

Definition at line 478 of file Dataset.h.

template<class InputT, class LabelT>
typedef boost::range_reference<const_element_range>::type shark::LabeledData< InputT, LabelT >::const_element_reference

Definition at line 494 of file Dataset.h.

template<class InputT, class LabelT>
typedef PairRangeType< element_type, typename InputContainer::element_range, typename LabelContainer::element_range >::type shark::LabeledData< InputT, LabelT >::element_range

Definition at line 473 of file Dataset.h.

template<class InputT, class LabelT>
typedef boost::range_reference<element_range>::type shark::LabeledData< InputT, LabelT >::element_reference

Definition at line 493 of file Dataset.h.

template<class InputT, class LabelT>
typedef InputContainer::IndexSet shark::LabeledData< InputT, LabelT >::IndexSet

Definition at line 453 of file Dataset.h.

template<class InputT, class LabelT>
typedef UnlabeledData<InputT> shark::LabeledData< InputT, LabelT >::InputContainer

Definition at line 451 of file Dataset.h.

template<class InputT, class LabelT>
typedef InputT shark::LabeledData< InputT, LabelT >::InputType

Definition at line 449 of file Dataset.h.

template<class InputT, class LabelT>
typedef Data<LabelT> shark::LabeledData< InputT, LabelT >::LabelContainer

Definition at line 452 of file Dataset.h.

template<class InputT, class LabelT>
typedef LabelT shark::LabeledData< InputT, LabelT >::LabelType

Definition at line 450 of file Dataset.h.

Constructor & Destructor Documentation

template<class InputT, class LabelT>
shark::LabeledData< InputT, LabelT >::LabeledData ( )
inline

Empty data set.

Definition at line 561 of file Dataset.h.

template<class InputT, class LabelT>
shark::LabeledData< InputT, LabelT >::LabeledData ( std::size_t  numBatches)
inline

Create an empty set with just the correct number of batches.

The user must then initialize the dataset manually.

Definition at line 567 of file Dataset.h.

template<class InputT, class LabelT>
shark::LabeledData< InputT, LabelT >::LabeledData ( std::size_t  size,
element_type const &  element,
std::size_t  batchSize = DefaultBatchSize 
)
inline

Optionally, the desired batch size can be set.

Parameters
size	the new size of the container
element	the blueprint element from which to create the container
batchSize	the size of the batches; if this is 0, the size is unlimited

Definition at line 577 of file Dataset.h.
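
To make the batchSize parameter concrete, the following stand-alone sketch computes one plausible set of batch sizes for a given container size. The function toyBatchSizes is hypothetical, and the scheme it uses (full batches first, remainder last) is an assumption; the documentation above does not state how Shark actually distributes elements across batches.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper, not part of Shark.
// Assumption: full batches come first and the remainder forms the last batch.
inline std::vector<std::size_t> toyBatchSizes(std::size_t size, std::size_t batchSize) {
    std::vector<std::size_t> sizes;
    if (size == 0) return sizes;            // empty container: no batches
    if (batchSize == 0) batchSize = size;   // 0 means "unlimited": one single batch
    for (std::size_t remaining = size; remaining > 0;) {
        const std::size_t n = std::min(batchSize, remaining);
        sizes.push_back(n);
        remaining -= n;
    }
    // Invariant: the batch sizes add up to the requested container size.
    assert(std::accumulate(sizes.begin(), sizes.end(), std::size_t(0)) == size);
    return sizes;
}
```

Under these assumptions, a size of 10 with batchSize 4 gives batches of 4, 4, and 2 elements, while batchSize 0 gives a single batch of 10.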

template<class InputT, class LabelT>
shark::LabeledData< InputT, LabelT >::LabeledData ( Data< InputType > const &  inputs,
Data< LabelType > const &  labels 
)
inline

Construction from data.

Beware that, when calling this constructor, the organization of batches must be the same in both containers. This constructor will not split the data!

Definition at line 586 of file Dataset.h.
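
The requirement that both containers share the same batch organization can be sketched as follows. This is a stand-alone illustration using nested std::vector objects rather than Shark's Data class; pairBatches is a hypothetical name, and throwing an exception on mismatch is a choice made for the sketch, not documented Shark behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Shark: pairs up input and label batches
// and rejects containers whose batch layouts differ. No data is re-split.
template <class InputT, class LabelT>
std::vector<std::pair<std::vector<InputT>, std::vector<LabelT>>>
pairBatches(const std::vector<std::vector<InputT>>& inputs,
            const std::vector<std::vector<LabelT>>& labels) {
    if (inputs.size() != labels.size())
        throw std::invalid_argument("different number of batches");
    std::vector<std::pair<std::vector<InputT>, std::vector<LabelT>>> out;
    out.reserve(inputs.size());
    for (std::size_t b = 0; b < inputs.size(); ++b) {
        if (inputs[b].size() != labels[b].size())
            throw std::invalid_argument("batch sizes differ");
        out.emplace_back(inputs[b], labels[b]);
    }
    assert(out.size() == inputs.size());
    return out;
}
```

If the two layouts differ, one option is to bring one container into the other's batch structure before pairing them.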

Member Function Documentation

template<class InputT, class LabelT>
const_batch_reference shark::LabeledData< InputT, LabelT >::batch ( std::size_t  i) const
inline

Definition at line 608 of file Dataset.h.

template<class InputT, class LabelT>
const_batch_range shark::LabeledData< InputT, LabelT >::batches ( ) const
inline

Returns the range of batches.

It is compatible with boost::range and the STL and can be used whenever an algorithm requires batch access via begin()/end(), in which case data.batches() provides the correct interface.

Definition at line 515 of file Dataset.h.

Referenced by shark::detail::SparseFFNetErrorWrapper< HiddenNeuron, OutputNeuron >::eval(), shark::detail::LossBasedErrorFunctionImpl< InputType, LabelType, OutputType >::evalDerivativePointSet(), shark::detail::LossBasedErrorFunctionImpl< InputType, LabelType, OutputType >::evalPointSet(), and shark::LDA::train().

template<class InputT, class LabelT>
batch_range shark::LabeledData< InputT, LabelT >::batches ( )
inline

Returns the range of batches.

It is compatible with boost::range and the STL and can be used whenever an algorithm requires batch access via begin()/end(), in which case data.batches() provides the correct interface.

Definition at line 522 of file Dataset.h.

template<class InputT, class LabelT>
shark::LabeledData< InputT, LabelT >::BOOST_STATIC_CONSTANT ( std::size_t  ,
DefaultBatchSize  = InputContainer::DefaultBatchSize 
)

template<class InputT, class LabelT>
const_element_reference shark::LabeledData< InputT, LabelT >::element ( std::size_t  i) const
inline

Definition at line 600 of file Dataset.h.

template<class InputT, class LabelT>
const_element_range shark::LabeledData< InputT, LabelT >::elements ( ) const
inline

Returns the range of elements.

It is compatible with boost::range and the STL and can be used whenever an algorithm requires element access via begin()/end(), in which case data.elements() provides the correct interface.

Definition at line 500 of file Dataset.h.

Referenced by shark::Centroids::initFromData(), shark::JaakkolaHeuristic::JaakkolaHeuristic(), shark::kMeans(), main(), shark::FisherLDA::meanAndScatter(), shark::operator<<(), shark::repartitionByClass(), shark::LabeledData< InputType, LabelType >::shuffle(), shark::KernelMeanClassifier< InputType >::train(), and shark::SigmoidFitPlatt::train().

template<class InputT, class LabelT>
element_range shark::LabeledData< InputT, LabelT >::elements ( )
inline

Returns the range of elements.

It is compatible with boost::range and the STL and can be used whenever an algorithm requires element access via begin()/end(), in which case data.elements() provides the correct interface.

Definition at line 507 of file Dataset.h.

template<class InputT, class LabelT>
bool shark::LabeledData< InputT, LabelT >::empty ( ) const
inline

Check whether the set is empty.

template<class InputT, class LabelT>
void shark::LabeledData< InputT, LabelT >::indexedSubset ( IndexSet const &  indices,
self_type &  subset 
) const
inline

Fill in the subset defined by the list of indices.

Definition at line 670 of file Dataset.h.

template<class InputT, class LabelT>
void shark::LabeledData< InputT, LabelT >::indexedSubset ( IndexSet const &  indices,
self_type &  subset,
self_type &  complement 
) const
inline

Fill in the subset defined by the list of indices as well as its complement.

Definition at line 676 of file Dataset.h.
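
The subset/complement split can be sketched on a plain sequence. The sketch is stand-alone and does not use Shark; toyIndexedSubset is a hypothetical name, and since the text above does not say whether the indices refer to batches or to elements, the sketch simply treats them as positions in a generic list of items.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of Shark: fills `subset` with the items at
// the given positions (in the order listed) and `complement` with all
// remaining items (in their original order).
template <class T>
void toyIndexedSubset(const std::vector<T>& items,
                      const std::vector<std::size_t>& indices,
                      std::vector<T>& subset,
                      std::vector<T>& complement) {
    // Validate first so the outputs are untouched if an index is invalid.
    for (std::size_t i : indices)
        if (i >= items.size()) throw std::out_of_range("index out of range");

    std::vector<bool> selected(items.size(), false);
    subset.clear();
    complement.clear();
    for (std::size_t i : indices) {
        subset.push_back(items[i]);
        selected[i] = true;
    }
    for (std::size_t i = 0; i < items.size(); ++i)
        if (!selected[i]) complement.push_back(items[i]);
    assert(subset.size() == indices.size());
}
```

With distinct indices, the subset and the complement together contain every item exactly once.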

template<class InputT, class LabelT>
InputContainer const& shark::LabeledData< InputT, LabelT >::inputs ( ) const
inline

Access to inputs as a separate container.

Definition at line 541 of file Dataset.h.

Referenced by shark::RadiusMarginQuotient< InputType, CacheType >::computeRadiusMargin(), shark::LooErrorCSvm< InputType, CacheType >::eval(), shark::NegativeGaussianProcessEvidence< InputType, OutputType, LabelType >::eval(), shark::CrossValidationError< ModelTypeT, LabelTypeT >::eval(), shark::detail::CostBasedErrorFunctionImpl< InputType, LabelType, OutputType >::eval(), shark::SvmLogisticInterpretation< InputType >::eval(), shark::RadiusMarginQuotient< InputType, CacheType >::evalDerivative(), shark::NegativeGaussianProcessEvidence< InputType, OutputType, LabelType >::evalDerivative(), shark::SvmLogisticInterpretation< InputType >::evalDerivative(), experiment(), shark::export_csv(), shark::export_kernel_matrix(), shark::LassoRegression< InputVectorType >::fillData(), shark::inputDimension(), main(), run_one_trial(), shark::KernelMeanClassifier< InputType >::train(), shark::Perceptron< InputType >::train(), shark::CARTTrainer::train(), shark::RFTrainer::train(), shark::RegularizationNetworkTrainer< InputType >::train(), shark::MissingFeatureSvmTrainer< InputType, CacheType >::train(), shark::McSvmOVATrainer< InputType, CacheType >::train(), shark::McSvmCSTrainer< InputType, CacheType >::train(), shark::McSvmMMRTrainer< InputType, CacheType >::train(), shark::McSvmADMTrainer< InputType, CacheType >::train(), shark::McSvmLLWTrainer< InputType, CacheType >::train(), shark::McSvmWWTrainer< InputType, CacheType >::train(), shark::McSvmATMTrainer< InputType, CacheType >::train(), shark::McSvmATSTrainer< InputType, CacheType >::train(), shark::CSvmTrainer< InputType, CacheType >::train(), shark::EpsilonSvmTrainer< InputType, CacheType >::train(), shark::transformInputs(), and shark::transformLabels().

template<class InputT, class LabelT>
InputContainer& shark::LabeledData< InputT, LabelT >::inputs ( )
inline

Access to inputs as a separate container.

Definition at line 545 of file Dataset.h.

template<class InputT, class LabelT>
LabelContainer const& shark::LabeledData< InputT, LabelT >::labels ( ) const
inline

Access to labels as a separate container.

Definition at line 550 of file Dataset.h.

Referenced by shark::classSizes(), shark::RadiusMarginQuotient< InputType, CacheType >::computeRadiusMargin(), shark::LooErrorCSvm< InputType, CacheType >::eval(), shark::NegativeGaussianProcessEvidence< InputType, OutputType, LabelType >::eval(), shark::CrossValidationError< ModelTypeT, LabelTypeT >::eval(), shark::detail::CostBasedErrorFunctionImpl< InputType, LabelType, OutputType >::eval(), shark::SvmLogisticInterpretation< InputType >::eval(), shark::NegativeGaussianProcessEvidence< InputType, OutputType, LabelType >::evalDerivative(), shark::SvmLogisticInterpretation< InputType >::evalDerivative(), experiment(), shark::export_csv(), shark::export_kernel_matrix(), shark::LassoRegression< InputVectorType >::fillData(), shark::kMeans(), shark::labelDimension(), main(), shark::numberOfClasses(), run_one_trial(), shark::KernelMeanClassifier< InputType >::train(), shark::CARTTrainer::train(), shark::RegularizationNetworkTrainer< InputType >::train(), shark::McSvmCSTrainer< InputType, CacheType >::train(), shark::McSvmMMRTrainer< InputType, CacheType >::train(), shark::McSvmADMTrainer< InputType, CacheType >::train(), shark::McSvmLLWTrainer< InputType, CacheType >::train(), shark::McSvmWWTrainer< InputType, CacheType >::train(), shark::McSvmATSTrainer< InputType, CacheType >::train(), shark::McSvmATMTrainer< InputType, CacheType >::train(), shark::CSvmTrainer< InputType, CacheType >::train(), shark::transformInputs(), and shark::transformLabels().

template<class InputT, class LabelT>
LabelContainer& shark::LabeledData< InputT, LabelT >::labels ( )
inline

Access to labels as a separate container.

Definition at line 554 of file Dataset.h.

template<class InputT, class LabelT>
virtual void shark::LabeledData< InputT, LabelT >::makeIndependent ( )
inline virtual

This method makes the vector independent of all siblings and parents.

Definition at line 627 of file Dataset.h.

template<class InputT, class LabelT>
std::size_t shark::LabeledData< InputT, LabelT >::numberOfElements ( ) const
inline

Returns the total number of elements.

Definition at line 531 of file Dataset.h.

Referenced by shark::RadiusMarginQuotient< InputType, CacheType >::computeRadiusMargin(), shark::RFTrainer::createCountMatrix(), shark::CARTTrainer::createCountMatrix(), shark::NegativeGaussianProcessEvidence< InputType, OutputType, LabelType >::eval(), shark::SvmLogisticInterpretation< InputType >::eval(), shark::detail::ParallelLossBasedErrorFunctionImpl< InputType, LabelType, OutputType >::eval(), shark::detail::SparseFFNetErrorWrapper< HiddenNeuron, OutputNeuron >::eval(), shark::NegativeGaussianProcessEvidence< InputType, OutputType, LabelType >::evalDerivative(), shark::SvmLogisticInterpretation< InputType >::evalDerivative(), shark::detail::ParallelLossBasedErrorFunctionImpl< InputType, LabelType, OutputType >::evalDerivative(), shark::detail::LossBasedErrorFunctionImpl< InputType, LabelType, OutputType >::evalDerivativePointSet(), shark::detail::LossBasedErrorFunctionImpl< InputType, LabelType, OutputType >::evalPointSet(), shark::export_libsvm(), main(), shark::FisherLDA::meanAndScatter(), shark::numberOfClasses(), shark::KernelTargetAlignment< InputType >::setDataset(), shark::Pegasos< VectorType >::solve(), shark::McPegasos< VectorType >::solve(), shark::KernelMeanClassifier< InputType >::train(), shark::Perceptron< InputType >::train(), shark::NBClassifierTrainer< InputType, OutputType >::train(), shark::CARTTrainer::train(), shark::RFTrainer::train(), shark::LinearRegression::train(), shark::LDA::train(), shark::McSvmCSTrainer< InputType, CacheType >::train(), shark::McSvmMMRTrainer< InputType, CacheType >::train(), shark::McSvmLLWTrainer< InputType, CacheType >::train(), shark::McSvmADMTrainer< InputType, CacheType >::train(), shark::McSvmWWTrainer< InputType, CacheType >::train(), shark::McSvmATSTrainer< InputType, CacheType >::train(), and shark::McSvmATMTrainer< InputType, CacheType >::train().

template<class InputT, class LabelT>
void shark::LabeledData< InputT, LabelT >::read ( InArchive &  archive)
inline virtual

from ISerializable

Reimplemented from shark::ISerializable.

Definition at line 615 of file Dataset.h.

template<class InputT, class LabelT>
template<class Range >
void shark::LabeledData< InputT, LabelT >::repartition ( Range const &  batchSizes)
inline

Reorders the batch structure in the container to that indicated by the batchSizes vector.

After the operation, the container will contain batchSizes.size() batches, with the i-th batch having size batchSizes[i]. The sum of all batch sizes must equal the current number of elements.

Definition at line 656 of file Dataset.h.

Referenced by shark::repartitionByClass().
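
The repartitioning rule can be sketched with nested std::vector objects standing in for the batch structure. The function toyRepartition is hypothetical and does not use Shark; preserving the element order and throwing when the sizes do not add up are assumptions made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of Shark: regroups the elements of `batches`
// so that the i-th output batch has batchSizes[i] elements.
// Assumption: element order is preserved across the regrouping.
template <class T>
std::vector<std::vector<T>> toyRepartition(const std::vector<std::vector<T>>& batches,
                                           const std::vector<std::size_t>& batchSizes) {
    // Flatten all batches into one sequence, keeping the original order.
    std::vector<T> flat;
    for (const auto& b : batches) flat.insert(flat.end(), b.begin(), b.end());

    // The requested sizes must account for every element exactly once.
    const std::size_t total =
        std::accumulate(batchSizes.begin(), batchSizes.end(), std::size_t(0));
    if (total != flat.size())
        throw std::invalid_argument("batch sizes must sum to the number of elements");

    std::vector<std::vector<T>> out;
    out.reserve(batchSizes.size());
    std::size_t pos = 0;
    for (std::size_t n : batchSizes) {
        out.emplace_back(flat.begin() + pos, flat.begin() + pos + n);
        pos += n;
    }
    assert(pos == flat.size());
    return out;
}
```

For example, four elements stored as batches of 3 and 1 can be regrouped into two batches of 2, but a request for sizes 1 and 1 is rejected because two elements would be left over.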

template<class InputT, class LabelT>
virtual void shark::LabeledData< InputT, LabelT >::shuffle ( )
inline virtual

Shuffles all elements in the entire dataset (that is, also across batches).

Definition at line 633 of file Dataset.h.

Referenced by main().
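
Shuffling across batches, as opposed to shuffling within each batch, can be sketched as follows. The sketch is stand-alone and does not use Shark; toyShuffle is a hypothetical name, and keeping the batch sizes fixed while the elements move between batches is an assumption about the intended behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>   // callers typically pass a std::mt19937
#include <vector>

// Hypothetical helper, not part of Shark: permutes all elements across the
// whole data set. Assumption: the batch sizes themselves stay unchanged.
template <class T, class Rng>
void toyShuffle(std::vector<std::vector<T>>& batches, Rng& rng) {
    // Gather every element, regardless of which batch it lives in.
    std::vector<T> flat;
    for (const auto& b : batches) flat.insert(flat.end(), b.begin(), b.end());

    std::shuffle(flat.begin(), flat.end(), rng);

    // Write the permuted elements back into the existing batch slots.
    std::size_t pos = 0;
    for (auto& b : batches)
        for (auto& x : b) x = flat[pos++];
    assert(pos == flat.size());
}
```

Passing the random number generator explicitly, for instance a seeded std::mt19937, keeps the sketch reproducible.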

template<class InputT, class LabelT>
self_type shark::LabeledData< InputT, LabelT >::splice ( std::size_t  batch)
inline

Splits the container into two independent parts. The left part remains in the container; the right part is returned.

The order of elements remains unchanged. The SharedVector must not be shared for this to work.

Definition at line 647 of file Dataset.h.
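
The split performed by splice can be sketched on a vector of batches. The sketch is stand-alone and does not use Shark; toySplice is a hypothetical name, and treating the batch argument as the index of the first batch that moves to the right part is an assumption, since the text above does not spell out which side that batch ends up on.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of Shark: batches [0, batch) stay in
// `batches`, batches [batch, end) are moved into the returned container.
// Assumption: `batch` is the index of the first batch of the right part.
template <class T>
std::vector<std::vector<T>> toySplice(std::vector<std::vector<T>>& batches,
                                      std::size_t batch) {
    if (batch > batches.size())
        throw std::out_of_range("batch index out of range");
    // Move the right part out, then drop the moved-from slots on the left.
    std::vector<std::vector<T>> right(
        std::make_move_iterator(batches.begin() + batch),
        std::make_move_iterator(batches.end()));
    batches.erase(batches.begin() + batch, batches.end());
    assert(batches.size() == batch);
    return right;
}
```

Both parts keep their elements in the original order, and after the call neither part refers to storage owned by the other.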

template<class InputT, class LabelT>
void shark::LabeledData< InputT, LabelT >::splitBatch ( std::size_t  batch,
std::size_t  elementIndex 
)
inline

Definition at line 638 of file Dataset.h.

template<class InputT, class LabelT>
void shark::LabeledData< InputT, LabelT >::write ( OutArchive &  archive) const
inline virtual

from ISerializable

Reimplemented from shark::ISerializable.

Definition at line 621 of file Dataset.h.

Friends And Related Function Documentation

template<class InputT, class LabelT>
void swap ( LabeledData< InputT, LabelT > &  a,
LabeledData< InputT, LabelT > &  b 
)
friend

Definition at line 661 of file Dataset.h.

Member Data Documentation


The documentation for this class was generated from the following file: