Experiment

Description

Default class defining an experiment. An Experiment object can contain multiple subjects, which in turn can contain multiple trials.

Initialisation

class krajjat.classes.experiment.Experiment(name=None)

Creates and returns an Experiment object. An Experiment object can contain multiple Subject instances, which in turn contain Trial instances.

New in version 2.0.

Parameters:

name (str, optional) – The name of the experiment.

name

The name of the experiment.

Type:

str or None

subjects

An ordered dictionary of Subject instances.

Type:

OrderedDict(subject)

Example

>>> experiment = Experiment("My experiment")
>>> subject = Subject("Lambert")
>>> experiment.add_subject(subject)

Magic methods

Experiment.__len__()

Returns the total number of subjects present in the Experiment instance.

New in version 2.0.

Returns:

The number of subjects present in the Experiment instance.

Return type:

int

Example

>>> experiment = Experiment("My experiment")
>>> experiment.add_subject(Subject("Raphaël"))
>>> experiment.add_subject(Subject("Chris"))
>>> len(experiment)
2
Experiment.__getitem__(name)

Returns a subject, given a name.

New in version 2.0.

Parameters:

name (str|int) – The name of the subject to retrieve.

Returns:

A Subject instance (if a Subject with the given name exists).

Return type:

Subject

Example

>>> subject = Subject("Mathilde")
>>> experiment = Experiment("My experiment")
>>> experiment.add_subject(subject)
>>> experiment["Mathilde"]
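The two magic methods above behave like lookups on the subjects dictionary. The following is a minimal sketch of that idea, not krajjat's actual code: MiniSubject and MiniExperiment are hypothetical stand-ins, and the sketch assumes subjects are stored in an OrderedDict keyed by subject name, as the subjects attribute suggests. It only covers lookup by name.

```python
from collections import OrderedDict


class MiniSubject:
    """Hypothetical stand-in for a Subject: only carries a name."""

    def __init__(self, name):
        self.name = name


class MiniExperiment:
    """Hypothetical stand-in for an Experiment (illustration only)."""

    def __init__(self, name=None):
        self.name = name
        self.subjects = OrderedDict()  # subject name -> subject

    def add_subject(self, subject):
        self.subjects[subject.name] = subject

    def __len__(self):
        # len(experiment) is the number of stored subjects
        return len(self.subjects)

    def __getitem__(self, name):
        # experiment["Mathilde"] looks the subject up by name
        return self.subjects[name]


experiment = MiniExperiment("My experiment")
mathilde = MiniSubject("Mathilde")
experiment.add_subject(mathilde)
experiment.add_subject(MiniSubject("Chris"))
```

With this stand-in, len(experiment) counts the two subjects and experiment["Mathilde"] gives back the same object that was added.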

Name methods

Experiment.set_name(name)

Sets the attribute Experiment.name of the Experiment instance.

New in version 2.0.

Parameters:

name (str) – The name of the experiment.

Example

>>> experiment = Experiment()
>>> experiment.set_name("My experiment")
Experiment.get_name()

Returns the attribute name of the Experiment instance.

New in version 2.0.

Returns:

The name of the experiment.

Return type:

str

Example

>>> experiment = Experiment("My experiment")
>>> experiment.get_name()
'My experiment'

Subject methods

Experiment.add_subject(subject, replace_if_exists=False)

Adds a Subject instance to the experiment.

New in version 2.0.

Parameters:
  • subject (Subject) – A Subject instance.

  • replace_if_exists (bool, optional) – If set to False (default), the function raises an exception if a subject with the same name already exists in the Experiment instance. If set to True, the existing subject is replaced.

Note

Subjects must have a name; adding a subject without a name will result in an error.

Example

>>> experiment = Experiment("My experiment")
>>> experiment.add_subject(Subject("Carole-Anne"))
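The interaction between the name requirement and replace_if_exists can be sketched as follows. This is an illustration under assumptions, not the library's implementation: MiniSubject and the free function add_subject are hypothetical, and ValueError is a placeholder because the source does not say which exception type is raised.

```python
from collections import OrderedDict


class MiniSubject:
    """Hypothetical stand-in for a Subject."""

    def __init__(self, name=None):
        self.name = name


def add_subject(subjects, subject, replace_if_exists=False):
    """Add `subject` to the `subjects` mapping, keyed by its name."""
    if subject.name is None:
        # Subjects must have a name (exception type is an assumption)
        raise ValueError("The subject must have a name.")
    if subject.name in subjects and not replace_if_exists:
        raise ValueError(f"A subject named {subject.name!r} already exists.")
    subjects[subject.name] = subject


subjects = OrderedDict()
first = MiniSubject("Carole-Anne")
second = MiniSubject("Carole-Anne")
add_subject(subjects, first)

try:
    add_subject(subjects, second)  # same name, replace_if_exists=False
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

# With replace_if_exists=True, the second subject overwrites the first
add_subject(subjects, second, replace_if_exists=True)
```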
Experiment.add_subjects(*subjects, replace_if_exists=False)

Adds multiple Subject instances to the experiment.

New in version 2.0.

Parameters:
  • subjects (Subject) – One or many Subject instances.

  • replace_if_exists (bool, optional) – If set to False (default), the function raises an exception if a subject with the same name already exists in the Experiment instance. If set to True, the existing subject is replaced.

Example

>>> experiment = Experiment("My experiment")
>>> subject1 = Subject("Maxime")
>>> subject2 = Subject("Laurène")
>>> subject3 = Subject("Margaux")
>>> experiment.add_subjects(subject1, subject2, subject3)
Experiment.get_subject(name)

Returns the Subject instance corresponding to the given name.

New in version 2.0.

Parameters:

name (str) – The name of the subject to retrieve.

Example

>>> experiment = Experiment("My experiment")
>>> subject = Subject("Stéphane")
>>> experiment.add_subject(subject)
>>> experiment.get_subject("Stéphane")
Experiment.get_subjects(names=None, group=None, return_type='dict', **kwargs)

Returns the complete list or a sublist of subjects from the Experiment instance.

New in version 2.0.

Parameters:
  • names (list(str), optional) – A list of subject names. If provided, the returned Subject instances will be a sublist of the subjects from the experiment.

  • group (str, int or None, optional) – If provided, the function will return the subjects having a Subject.group equal to the provided value. In the case where a list of subject names is also provided, the subjects in that list and belonging to that group will be returned.

  • return_type (str, optional) – Defines whether to return the subjects as a dictionary or a list. The default value is "dict", which returns the subjects as a dictionary with the subject names as keys. If set to "list", the function will return a list containing the subjects in the same order as their names passed as parameter.

Note

The function also allows for other arguments, as long as each argument is a custom attribute set to the Subject instances.

Returns:

A dictionary or a list of Subject instances, depending on return_type.

Return type:

dict(str, Subject) or list(Subject)

Example

>>> experiment = Experiment("My experiment")
>>> subject1 = Subject("Chloé", group="Control")
>>> subject2 = Subject("Jérôme", group="Experimental")
>>> subject3 = Subject("Étienne", group="Control")
>>> experiment.add_subjects(subject1, subject2, subject3)
>>> experiment.get_subjects()  # Returns all the subjects
>>> experiment.get_subjects(group="Control")  # Returns all the subjects from the group "Control" (1 and 3)
>>> experiment.get_subjects(["Chloé", "Jérôme"])  # Returns subjects 1 and 2
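The combination of names, group, custom attributes and return_type can be sketched with stand-in objects. MiniSubject and select_subjects are hypothetical names, and the way custom attributes are attached to subjects here (keyword arguments turned into instance attributes) is an assumption made for the illustration; it is not taken from the Subject class.

```python
from collections import OrderedDict


class MiniSubject:
    """Hypothetical stand-in for a Subject with optional custom attributes."""

    def __init__(self, name, group=None, **attributes):
        self.name = name
        self.group = group
        # Custom attributes (e.g. age=25) become instance attributes
        for key, value in attributes.items():
            setattr(self, key, value)


def select_subjects(subjects, names=None, group=None, return_type="dict", **kwargs):
    """Filter an ordered mapping of subjects by names, group and custom attributes."""
    if names is None:
        names = list(subjects.keys())
    selected = OrderedDict()
    for name in names:
        subject = subjects[name]
        if group is not None and subject.group != group:
            continue
        # Every extra keyword must match an attribute of the subject
        if any(getattr(subject, key, None) != value for key, value in kwargs.items()):
            continue
        selected[name] = subject
    if return_type == "list":
        return list(selected.values())
    return selected


subjects = OrderedDict()
for s in (MiniSubject("Chloé", group="Control", age=25),
          MiniSubject("Jérôme", group="Experimental", age=31),
          MiniSubject("Étienne", group="Control", age=31)):
    subjects[s.name] = s

control = select_subjects(subjects, group="Control")
two = select_subjects(subjects, ["Jérôme", "Chloé"], return_type="list")
aged_31 = select_subjects(subjects, age=31, return_type="list")
```

In the "list" case the result follows the order of the names passed in, which is why two starts with Jérôme.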
Experiment.get_subjects_names(group=None, **kwargs)

Returns a list containing the Subject.name attribute of each subject.

New in version 2.0.

Parameters:

group (str, int or None, optional) – If provided, the function will return only the names of the subjects having a Subject.group equal to the provided value.

Returns:

A list of all the subjects' names.

Return type:

list(str)

Note

The function also allows for other arguments, as long as each argument is a custom attribute set to the Subject instances.

Example

>>> experiment = Experiment("My experiment")
>>> subject1 = Subject("Tom")
>>> subject2 = Subject("Floriane")
>>> subject3 = Subject("Hélène")
>>> experiment.add_subjects(subject1, subject2, subject3)
>>> experiment.get_subjects_names()
["Tom", "Floriane", "Hélène"]
Experiment.remove_subject(name)

Removes a subject from the Experiment instance corresponding to the given name.

New in version 2.0.

Parameters:

name (str) – The name of the subject to remove.

Example

>>> experiment = Experiment("My experiment")
>>> subject1 = Subject("Arthur")
>>> experiment.add_subject(subject1)
>>> experiment.remove_subject("Arthur")
Experiment.get_number_of_subjects(group=None, **kwargs)

Returns the length of the subjects attribute, or the number of subjects in the experiment matching the group or other attributes.

New in version 2.0.

Parameters:

group (str, int or None, optional) – If provided, the function will return the number of subjects having a Subject.group equal to the provided value.

Returns:

The number of subjects in the experiment, or the number of subjects matching the given group or attributes.

Return type:

int

Note

The function also allows for other arguments, as long as each argument is a custom attribute set to the Subject instances.

Example

>>> experiment = Experiment("My experiment")
>>> subject1 = Subject("Sacha")
>>> subject2 = Subject("Tallulah")
>>> subject3 = Subject("Mika")
>>> experiment.add_subjects(subject1, subject2, subject3)
>>> experiment.get_number_of_subjects()
3

Other getter

Experiment.get_joint_labels()

Returns the joint labels from the first sequence of the first subject of the experiment.

New in version 2.0.

Returns:

A list of joint labels.

Return type:

list(str)

Example

>>> experiment = Experiment("My experiment")
>>> subject1 = Subject("Robin")
>>> trial1 = Trial("R001", sequence=Sequence("Robin/sequence_001.txt"))  # Head
>>> trial2 = Trial("R002", sequence=Sequence("Robin/sequence_002.txt"))  # Head, HandRight
>>> subject1.add_trials(trial1, trial2)
>>> subject2 = Subject("Alain")
>>> trial1 = Trial("R001", sequence=Sequence("Alain/sequence_001.txt"))  # HandRight, HandLeft
>>> trial2 = Trial("R002", sequence=Sequence("Alain/sequence_002.txt"))  # KneeRight
>>> subject2.add_trials(trial1, trial2)
>>> experiment.add_subjects(subject1, subject2)
>>> experiment.get_joint_labels()
["Head", "HandRight", "HandLeft", "KneeRight"]

Dataframe methods

Experiment.get_dataframe(sequence_measure='distance', audio_measure='envelope', sampling_frequency=None, exclude_columns=None, include_columns=None, verbosity=1, **kwargs)

Returns the data from the experiment as a Pandas dataframe containing multiple columns.

New in version 2.0.

Parameters:
  • sequence_measure (str or list(str), optional) –

    The time series to be returned, can be one or a combination of the following:

    • For the x-coordinate: "x", "x_coord", "coord_x", or "x_coordinate".

    • For the y-coordinate: "y", "y_coord", "coord_y", or "y_coordinate".

    • For the z-coordinate: "z", "z_coord", "coord_z", or "z_coordinate".

    • For all the coordinates: "xyz", "coordinates", "coord", "coords", or "coordinate".

    • For the consecutive distances: "d", "distances", "dist", "distance", or 0.

    • For the consecutive distances on the x-axis: "dx", "distance_x", "x_distance", "dist_x", or "x_dist".

    • For the consecutive distances on the y-axis: "dy", "distance_y", "y_distance", "dist_y", or "y_dist".

    • For the consecutive distances on the z-axis: "dz", "distance_z", "z_distance", "dist_z", or "z_dist".

    • For the velocity: "v", "vel", "velocity", "velocities", "speed", or 1.

    • For the acceleration: "a", "acc", "acceleration", "accelerations", or 2.

    • For the jerk: "j", "jerk", or 3.

    • For the snap: "s", "snap", "joust" or 4.

    • For the crackle: "c", "crackle", or 5.

    • For the pop: "p", "pop", or 6.

    • For any derivative of a higher order, set the corresponding integer.

  • audio_measure (str, list(str) or None, optional) –

    The time series to be returned, can be one or a combination of the following:

    • "audio", for the original sample values.

    • "envelope"

    • "pitch"

    • "f1", "f2", "f3", "f4", "f5" for the values of the corresponding formant

    • "intensity"

    This value can also be None: in that case, none of the data will be assigned to audio measures. If the subjects’ trials contain Audio instances, and this parameter is set to an audio derivative, the audio derivative will be calculated, which can take some time. If the audio measure is set to an AudioDerivative and the subjects’ trials contain the same AudioDerivative, the data will be taken as-is. Any other case will raise an error.

  • sampling_frequency (float, optional) – The frequency at which to resample the two measures before adding them to the dataframe. By default, no resampling is applied. If the sampling frequencies of the Sequence and its corresponding Audio or AudioDerivative differ, the function will raise an exception.

  • exclude_columns (list or None, optional) –

    A list of the columns to exclude from the dataframe. By default, the columns included are:

    • "subject", containing the subject Subject.name.

    • "group", containing the Subject.group attribute of the Subject instances.

    • "trial", containing the Trial.trial_id attribute of each Trial instance.

    • "condition", containing the Trial.condition attribute of each Trial instance.

    • "modality", containing "mocap" for data from Sequence instances, "audio" for data from Audio instances, or "pca" or "ica" for data from PCA or ICA computations.

    • "label", containing the name of each feature: either the name of a joint label, "audio", or a PCA/ICA number.

    • "measure", indicating the selected sequence measure (e.g. "distance") or audio measure (e.g. "envelope").

    • "timestamp", containing the timestamps for each Trial.

    • "value", containing the measure value of the current modality label at the given timestamp.

  • include_columns (list or None, optional) –

    A list of columns to include in the dataframe. The function will try to find attributes matching the given column names in the Subject and the Trial instances.

    Note

    The function first tries to find the attribute value in the Subject instance, then in the Trial instances. If there is a match in the Subject instance, the function will not try to find an attribute value in the Trial instances, so ensure that you have unique attribute names between Subject and Trial instances. Moreover, if a specific attribute cannot be found for a specific Subject or Trial, the function sets the values to numpy.nan.

  • verbosity (int, optional) –

    Sets how much feedback the code will provide in the console output:

    • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

    • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

    • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.
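The sequence_measure aliases listed above can be thought of as a lookup table from alias to measure. The sketch below shows one way to write such a table for the distance and derivative aliases; normalise_measure and the canonical names it returns are hypothetical choices made for this illustration, not identifiers from the library.

```python
# Alias table copied from the parameter list (subset: distances and derivatives)
ALIASES = {
    "distance": ("d", "distances", "dist", "distance", 0),
    "velocity": ("v", "vel", "velocity", "velocities", "speed", 1),
    "acceleration": ("a", "acc", "acceleration", "accelerations", 2),
    "jerk": ("j", "jerk", 3),
    "snap": ("s", "snap", "joust", 4),
    "crackle": ("c", "crackle", 5),
    "pop": ("p", "pop", 6),
}

# Invert the table: alias -> canonical name
LOOKUP = {alias: canonical
          for canonical, aliases in ALIASES.items()
          for alias in aliases}


def normalise_measure(measure):
    """Return a canonical name for an alias, or the integer itself for
    derivatives of an order higher than 6."""
    if measure in LOOKUP:
        return LOOKUP[measure]
    if isinstance(measure, int) and measure > 6:
        return measure  # higher-order derivative, kept as its integer
    raise ValueError(f"Unknown sequence measure: {measure!r}")
```

For instance, "speed" and 1 both resolve to the velocity entry, while 7 is passed through as a derivative order.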

Note

The function also allows setting a series of other attributes related to the generation of an Audio or AudioDerivative object. For example, if you choose "envelope" as an audio measure, you can add parameters used to generate an envelope in the function Audio.get_envelope(), such as "window_size" or "filter_below"; these will be directly used when generating the envelope. To know which parameters can be used for each audio derivative, please refer to the corresponding methods of the Audio class:

  • Audio.filter_and_resample()

  • Audio.get_envelope()

  • Audio.get_pitch()

  • Audio.get_formant()

  • Audio.get_intensity()
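To make the default column layout and the include_columns lookup order more concrete, here is a sketch that builds long-format rows as plain dictionaries, one row per timestamp and label. MiniTrial, MiniSubject, lookup and build_rows are hypothetical, the values are made up, math.nan stands in for numpy.nan, and only the "mocap" modality is shown; this is not how the library builds its dataframe.

```python
import math


class MiniTrial:
    """Hypothetical stand-in for a Trial."""

    def __init__(self, trial_id, condition, timestamps, values, **attributes):
        self.trial_id = trial_id
        self.condition = condition
        self.timestamps = timestamps
        self.values = values  # label -> list of measure values
        self.__dict__.update(attributes)  # custom attributes


class MiniSubject:
    """Hypothetical stand-in for a Subject."""

    def __init__(self, name, group, trials, **attributes):
        self.name = name
        self.group = group
        self.trials = trials
        self.__dict__.update(attributes)  # custom attributes


def lookup(subject, trial, attribute):
    """Subject first, then Trial, else NaN (the order described in the note)."""
    if hasattr(subject, attribute):
        return getattr(subject, attribute)
    if hasattr(trial, attribute):
        return getattr(trial, attribute)
    return math.nan


def build_rows(subjects, measure, exclude_columns=(), include_columns=()):
    rows = []
    for subject in subjects:
        for trial in subject.trials:
            for label, series in trial.values.items():
                for timestamp, value in zip(trial.timestamps, series):
                    row = {"subject": subject.name, "group": subject.group,
                           "trial": trial.trial_id, "condition": trial.condition,
                           "modality": "mocap", "label": label,
                           "measure": measure, "timestamp": timestamp,
                           "value": value}
                    for column in exclude_columns:
                        row.pop(column, None)
                    for column in include_columns:
                        row[column] = lookup(subject, trial, column)
                    rows.append(row)
    return rows


trial = MiniTrial("R001", "English", [0.0, 0.1], {"Head": [0.5, 0.7]}, room="A")
subject = MiniSubject("Robin", "Control", [trial], age=25)
rows = build_rows([subject], "distance", exclude_columns=["group"],
                  include_columns=["age", "room", "handedness"])
```

Here "age" is found on the subject, "room" on the trial, and "handedness" on neither, so it is filled with NaN. A list of dictionaries like rows can be passed to pandas.DataFrame to obtain one column per key.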

Experiment.save_dataframe(folder_out='', name='dataframe', file_format='pkl', sequence_measure='distance', audio_measure='envelope', sampling_frequency=None, exclude_columns=None, include_columns=None, verbosity=1, **kwargs)

Saves a dataframe to disk.

New in version 2.0.

Parameters:
  • folder_out (str, optional) – The path to the folder where to save the dataframe. If one or more subfolders of the path do not exist, the function will create them. If the string provided is empty (default), the dataframe will be saved in the current working directory. If the string provided contains a file with an extension, the fields name and file_format will be ignored.

  • name (str, optional) – Defines the name of the file or files where to save the dataframe. By default, it is set to "dataframe".

  • file_format (str, optional) –

    The file format in which to save the dataframe. The file format must be "pkl" (default), "gzip", "json", "xlsx", "txt", "csv", "tsv", or, if you are a masochist, "mat". Notes:

    • "xls" will save the file with an .xlsx extension.

    • Excel files have a limited number of rows, which may not be compatible with big datasets.

    • Any string starting with a dot will be accepted (e.g. ".csv" instead of "csv").

    • "csv;" will force the value separator on ;, while "csv," will force the separator on ,. By default, the function will detect which separator the system uses.

    • "txt" and "tsv" both separate the values by a tabulation.

    • Any other string will not return an error, but rather be used as a custom extension. The data will be saved as a text file (using tabulations as value separators).

  • sequence_measure (str or list(str), optional) –

    The time series to be returned, can be one or a combination of the following:

    • For the x-coordinate: "x", "x_coord", "coord_x", or "x_coordinate".

    • For the y-coordinate: "y", "y_coord", "coord_y", or "y_coordinate".

    • For the z-coordinate: "z", "z_coord", "coord_z", or "z_coordinate".

    • For all the coordinates: "xyz", "coordinates", "coord", "coords", or "coordinate".

    • For the consecutive distances: "d", "distances", "dist", "distance", or 0.

    • For the consecutive distances on the x-axis: "dx", "distance_x", "x_distance", "dist_x", or "x_dist".

    • For the consecutive distances on the y-axis: "dy", "distance_y", "y_distance", "dist_y", or "y_dist".

    • For the consecutive distances on the z-axis: "dz", "distance_z", "z_distance", "dist_z", or "z_dist".

    • For the velocity: "v", "vel", "velocity", "velocities", "speed", or 1.

    • For the acceleration: "a", "acc", "acceleration", "accelerations", or 2.

    • For the jerk: "j", "jerk", or 3.

    • For the snap: "s", "snap", "joust" or 4.

    • For the crackle: "c", "crackle", or 5.

    • For the pop: "p", "pop", or 6.

    • For any derivative of a higher order, set the corresponding integer.

  • audio_measure (str) –

    The time series to be returned, can be either:

    • "audio", for the original sample values.

    • "envelope"

    • "pitch"

    • "f1", "f2", "f3", "f4", "f5" for the values of the corresponding formant

    • "intensity"

  • sampling_frequency (float, optional) – The frequency at which to resample the two measures before adding them to the dataframe. By default, no resampling is applied. If the sampling frequencies of the Sequence and its corresponding Audio differ, the function will raise an exception.

  • exclude_columns (list or None, optional) –

    A list of the columns to exclude from the dataframe. By default, the columns included are:

    • "subject", containing the subject Subject.name.

    • "group", containing the Subject.group attribute of the Subject instances.

    • "trial", containing the Trial.trial_id attribute of each Trial instance.

    • "condition", containing the Trial.condition attribute of each Trial instance.

    • "joint_label", containing the name of each joint label for each sequence.

    • "timestamp", containing the timestamps for each Trial.

    • A column containing the selected sequence measure, which will be either "x", "y", "z", "distance_hands", "distance", "distance_x", "distance_y", "distance_z", "velocity", "acceleration", or "acceleration_abs".

    • A column containing the selected audio measure, either "audio", "envelope", "pitch", "f1", "f2", "f3", "f4", "f5", or "intensity".

  • include_columns (list or None, optional) –

    A list of columns to include in the dataframe. The function will try to find attributes matching the given column names in the Subject and the Trial instances.

    Note

    The function first tries to find the attribute value in the Subject instance, then in the Trial instances. If there is a match in the Subject instance, the function will not try to find an attribute value in the Trial instances, so ensure that you have unique attribute names between Subject and Trial instances. Moreover, if a specific attribute cannot be found for a specific Subject or Trial, the function sets the values to numpy.nan.

  • verbosity (int, optional) –

    Sets how much feedback the code will provide in the console output:

    • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

    • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

    • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.
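The file_format rules listed above (leading dot, "xls", forced CSV separators, tab-separated fallbacks) can be summarised as a small resolution function. This is a sketch under assumptions: resolve_format is a hypothetical helper, and the system separator detection mentioned for "csv" is replaced by a default_separator argument.

```python
def resolve_format(file_format, default_separator=","):
    """Return (extension, separator) for a file_format string.

    The separator is None for formats that are not delimited text.
    """
    fmt = file_format.lstrip(".")  # ".csv" is accepted in place of "csv"
    if fmt == "xls":
        return "xlsx", None  # "xls" is saved with an .xlsx extension
    if fmt in ("pkl", "gzip", "json", "xlsx", "mat"):
        return fmt, None
    if fmt == "csv;":
        return "csv", ";"  # separator forced to a semicolon
    if fmt == "csv,":
        return "csv", ","  # separator forced to a comma
    if fmt == "csv":
        # Stand-in for the separator detection described above
        return "csv", default_separator
    # "txt", "tsv" and any custom extension: tab-separated text
    return fmt, "\t"
```

For example, resolve_format("csv;") yields a "csv" extension with ";" as separator, and an unknown string such as "dat" is kept as a custom extension with tab-separated values.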

Note

The function also allows setting a series of other attributes related to the generation of an Audio or AudioDerivative object. For example, if you choose "envelope" as an audio measure, you can add parameters used to generate an envelope in the function Audio.get_envelope(), such as "window_size" or "filter_below"; these will be directly used when generating the envelope. To know which parameters can be used for each audio derivative, please refer to the corresponding methods of the Audio class:

  • Audio.filter_and_resample()

  • Audio.get_envelope()

  • Audio.get_pitch()

  • Audio.get_formant()

  • Audio.get_intensity()