Module mixture :: Class ProductDistribution

Class ProductDistribution

Class for joint distributions over a vector of random variables of (possibly) different types. The features are assumed to be independent, so the class implements the naive Bayes model.

Instance Methods
 
__init__(self, distList)
Constructor
 
__eq__(self, other)
Interface for the '==' operation
 
__copy__(self)
Interface for the copy.copy function
 
__str__(self)
String representation of the distribution
 
__getitem__(self, ind)
 
__setitem__(self, ind, value)
 
__len__(self)
 
pdf(self, data)
Density function.
 
sample(self)
Samples a single value from the distribution.
 
sampleSet(self, nr)
Samples several values from the distribution.
 
sampleDataSet(self, nr)
Returns a DataSet object of size 'nr'.
 
MStep(self, posterior, data, mix_pi=None)
Maximization step of the EM procedure.
 
formatData(self, x)
Formats samples 'x' for inclusion into DataSet object.
 
isValid(self, x)
Checks whether 'x' is a valid argument for the distribution and raises an InvalidDistributionInput exception if it is not.
 
flatStr(self, offset)
Returns the model parameters as a string compatible with the WriteMixture/ReadMixture flat file format.
 
posteriorTraceback(self, x)
Returns the decoupled posterior distribution for each sample in 'x'.
 
update_suff_p(self)
Updates the .suff_p field.

Inherited from ProbDistribution: merge, sufficientStatistics

Method Details

__init__(self, distList)
(Constructor)

Constructor

Parameters:
  • distList - list of ProbDistribution objects
Overrides: ProbDistribution.__init__

__eq__(self, other)
(Equality operator)

Interface for the '==' operation

Parameters:
  • other - object to be compared
Overrides: ProbDistribution.__eq__
(inherited documentation)

__copy__(self)

Interface for the copy.copy function

Overrides: ProbDistribution.__copy__
(inherited documentation)

__str__(self)
(Informal representation operator)

String representation of the distribution

Returns:
string representation
Overrides: ProbDistribution.__str__
(inherited documentation)

pdf(self, data)

Density function. MUST accept either a numpy array or a DataSet object of appropriate values. numpy arrays are used as input for the atomic distributions for efficiency reasons. (The cleaner solution would be to construct DataSet subset objects for the different features, and we might switch to that eventually.)

Parameters:
  • data - DataSet object or numpy array
Returns:
log-value of the density function for each sample in 'data'
Overrides: ProbDistribution.pdf
(inherited documentation)
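
Because the features are assumed independent, the joint log-density of a sample is the sum of the per-feature log-densities. The following is a minimal, self-contained sketch of that idea, not the library's implementation: the helper names (normal_logpdf, discrete_logpdf, product_logpdf) and the example parameters are hypothetical and use only the Python standard library.

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log-density of a univariate normal distribution."""
    return (-0.5 * math.log(2.0 * math.pi * sigma ** 2)
            - (x - mu) ** 2 / (2.0 * sigma ** 2))

def discrete_logpdf(x, probs):
    """Log-probability of symbol x; probs maps symbol -> probability."""
    return math.log(probs[x])

def product_logpdf(sample, feature_logpdfs):
    """Sum the per-feature log-densities (independence assumption)."""
    return sum(f(x) for f, x in zip(feature_logpdfs, sample))

# Hypothetical two-feature model: one continuous and one discrete feature.
features = [
    lambda x: normal_logpdf(x, mu=0.0, sigma=1.0),
    lambda x: discrete_logpdf(x, {"A": 0.25, "B": 0.75}),
]
data = [(0.0, "A"), (1.0, "B")]

# One log-density value per sample, mirroring the documented return value.
log_densities = [product_logpdf(s, features) for s in data]
```

Working in log space keeps the product of many small densities from underflowing, which is presumably why the method is documented as returning log-values.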

sample(self)

Samples a single value from the distribution.

Returns:
sampled value
Overrides: ProbDistribution.sample
(inherited documentation)

sampleSet(self, nr)

Samples several values from the distribution.

Parameters:
  • nr - number of values to be sampled.
Returns:
sampled values
Overrides: ProbDistribution.sampleSet
(inherited documentation)
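
Under the independence assumption, a joint sample can be drawn by sampling each feature on its own and collecting the results. The sketch below illustrates this with hypothetical helper names (sample_product, sample_set) and made-up feature samplers; it does not call the library and is not its implementation.

```python
import random

def sample_product(feature_samplers, rng):
    """Draw one joint sample by sampling every feature independently."""
    return [draw(rng) for draw in feature_samplers]

def sample_set(feature_samplers, nr, rng):
    """Draw nr joint samples."""
    return [sample_product(feature_samplers, rng) for _ in range(nr)]

# Hypothetical features: a standard normal and a two-symbol discrete variable.
samplers = [
    lambda r: r.gauss(0.0, 1.0),
    lambda r: r.choices(["A", "B"], weights=[0.25, 0.75])[0],
]

rng = random.Random(0)  # fixed seed so the draw is reproducible
samples = sample_set(samplers, 5, rng)
```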

sampleDataSet(self, nr)

Returns a DataSet object of size 'nr'.

Parameters:
  • nr - size of DataSet to be sampled
Returns:
DataSet object

MStep(self, posterior, data, mix_pi=None)

Maximization step of the EM procedure. Reestimates the distribution parameters using the posterior distribution and the data.

MUST accept either a numpy array or a DataSet object of appropriate values. numpy arrays are used as input for the atomic distributions for efficiency reasons.

Parameters:
  • posterior - posterior distribution of component membership
  • data - DataSet object or numpy array of samples
  • mix_pi - mixture weights; needed when MixtureModels are used as components.
Overrides: ProbDistribution.MStep
(inherited documentation)
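
For a product of independent features, the maximization step can be carried out feature by feature: each feature re-estimates its parameters from its own data column, weighted by the posterior. The sketch below assumes normally distributed features and returns new parameters instead of updating in place; the function names are hypothetical, mix_pi is left out, and nothing here is taken from the library's source.

```python
import math

def weighted_normal_mstep(posterior, values):
    """Posterior-weighted maximum likelihood estimate of mean and std."""
    total = sum(posterior)
    mu = sum(w * x for w, x in zip(posterior, values)) / total
    var = sum(w * (x - mu) ** 2 for w, x in zip(posterior, values)) / total
    return mu, math.sqrt(var)

def product_mstep(posterior, data, feature_msteps):
    """Apply each feature's M-step to its own column of the data."""
    columns = list(zip(*data))  # split samples into per-feature columns
    return [mstep(posterior, col)
            for mstep, col in zip(feature_msteps, columns)]

# Three samples with two features; the third sample has posterior weight 0,
# so it does not influence the estimates.
posterior = [1.0, 1.0, 0.0]
data = [(0.0, 10.0), (2.0, 14.0), (100.0, -50.0)]
params = product_mstep(posterior, data,
                       [weighted_normal_mstep, weighted_normal_mstep])
```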

formatData(self, x)

Formats samples 'x' for inclusion into DataSet object. Used by DataSet.internalInit()

Parameters:
  • x - list of samples
Returns:
two element list: first element = dimension of self, second element = sufficient statistics for samples 'x'
Overrides: ProbDistribution.formatData
(inherited documentation)

isValid(self, x)

Checks whether 'x' is a valid argument for the distribution and raises an InvalidDistributionInput exception if it is not.

Parameters:
  • x - single sample in external representation, i.e. an entry of DataSet.dataMatrix
Returns:
True/False flag
Overrides: ProbDistribution.isValid
(inherited documentation)
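
One way to picture validation for a product of features is to check the sample length and then let each feature validate its own entry. The sketch below is illustrative only: the exception class is a local stand-in that merely shares its name with the one mentioned above, the validator helpers are hypothetical, and the external representation is assumed to be a list of strings.

```python
class InvalidDistributionInput(Exception):
    """Local stand-in for the exception named in the documentation."""

def validate_float(x):
    """Accept anything that can be converted to a float."""
    try:
        float(x)
    except (TypeError, ValueError):
        raise InvalidDistributionInput("not a number: %r" % (x,))

def make_alphabet_validator(alphabet):
    """Build a validator that accepts only symbols from 'alphabet'."""
    def validate(x):
        if x not in alphabet:
            raise InvalidDistributionInput("unknown symbol: %r" % (x,))
    return validate

def validate_product(sample, validators):
    """Check the sample length, then validate each feature separately."""
    if len(sample) != len(validators):
        raise InvalidDistributionInput(
            "expected %d features, got %d" % (len(validators), len(sample)))
    for check, x in zip(validators, sample):
        check(x)

validators = [validate_float, make_alphabet_validator({"A", "B"})]
validate_product(["0.5", "A"], validators)  # valid sample, no exception
```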

flatStr(self, offset)

Returns the model parameters as a string compatible with the WriteMixture/ReadMixture flat file format.

Parameters:
  • offset - number of ' ' characters to be used in the flat file.
Overrides: ProbDistribution.flatStr
(inherited documentation)

posteriorTraceback(self, x)

Returns the decoupled posterior distribution for each sample in 'x'. Used for analysis of clustering results.

Parameters:
  • x - list of samples
Returns:
decoupled posterior
Overrides: ProbDistribution.posteriorTraceback
(inherited documentation)

update_suff_p(self)

Updates the .suff_p field.

Overrides: ProbDistribution.update_suff_p
(inherited documentation)