Archive for the ‘cocktail party’ Category

Speech Scholar

Saturday, October 11th, 2008

STFT and ISTFT

Wednesday, September 17th, 2008

Problem: I need to invert a spectrogram to listen to the signal. And I want to do it the best way possible.

Solution: It isn’t all that sophisticated. Here’s a limited implementation of Yang’s ICASSP 2008 solution, which is (surprise, surprise) very nearly the same as the ubiquitous heuristic overlap-and-add ISTFT.

It’s limited in the following ways:

  • The least-squares (LS) solution is not implemented. This means that an LS solution is available only for regularly spaced frequency bands, since in that case the LS solution reduces to the p=2 solution.
  • MATLAB’s fft/ifft routines are used, and they take care of the P matrix (zero-padding up to a power of 2).
  • The matrix algebra is not exploited; this is a quick and dirty test run.
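For comparison, here is a minimal Python/NumPy sketch of the plain weighted overlap-add inverse mentioned above. It is not Yang’s method and not the MATLAB code behind the link. The function names `stft` and `istft`, the periodic Hann window, and the frame and hop sizes are all assumptions chosen for illustration. Each frame is inverse-transformed, windowed again, added into place, and the sum is divided by the accumulated squared window (the least-squares form of overlap-add due to Griffin and Lim).

```python
import numpy as np


def stft(x, win, hop):
    """Windowed FFT of each frame; rows are frames, columns are frequency bins."""
    n = len(win)
    starts = range(0, len(x) - n + 1, hop)
    return np.array([np.fft.rfft(win * x[s:s + n]) for s in starts])


def istft(frames, win, hop):
    """Weighted overlap-add inverse of stft()."""
    n = len(win)
    length = hop * (len(frames) - 1) + n
    num = np.zeros(length)  # sum of windowed, inverse-transformed frames
    den = np.zeros(length)  # sum of squared windows at each sample
    for m, spec in enumerate(frames):
        s = m * hop
        num[s:s + n] += win * np.fft.irfft(spec, n)
        den[s:s + n] += win ** 2
    # Samples with (near) zero window energy cannot be recovered; leave them at 0.
    return np.divide(num, den, out=np.zeros(length), where=den > 1e-12)


# Assumed parameters: 64-sample periodic Hann window, 75% overlap.
n, hop = 64, 16
win = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(n) / n)
x = np.random.default_rng(0).standard_normal(n + 20 * hop)
y = istft(stft(x, win, hop), win, hop)
# Largest reconstruction error, ignoring the first sample (window is zero there).
print(np.max(np.abs(y[1:] - x[1:])))
```

If the spectrogram has been modified before inversion, the same routine returns the signal whose STFT is closest to it in the least-squares sense for this window, which is why the heuristic works as well as it does.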


Scalescape

Sunday, August 24th, 2008

scalescape

Get PRAAT Pitch into MATLAB/Python

Thursday, August 7th, 2008

Problem: I want to compare my pitch extraction algorithm to PRAAT’s.

Briefly: Write a Praat script to extract the pitch and dump it into a text file. Eat that file up in your favorite environment.

Solution:
The following get_praat_pitch.m calls the extract_pitch.praat script.


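function praatpitch=get_praat_pitch(snd, fs)
% Write snd (sampled at fs Hz) to a temporary wav file, run Praat on it,
% and return the pitch track as rows of [time hertz].

% TODO hardcoded location of praat
        fprintf('WARNING get_praat_pitch: Hardcoded location of praat and extract_pitch.praat\n');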
        global MSOROOTDIR
        SNDDIR=[MSOROOTDIR, 'sep', filesep];
        FILENM=['praat-tmp'];
        FULLFILENM=[SNDDIR, FILENM, '.wav'];
        wavwrite(snd, fs, FULLFILENM);

        if(ispc()),
                PRAATCOMM=['"', MSOROOTDIR, 'extraneous/praatcon-win', '"'];
        elseif(isunix()),
                PRAATCOMM=[MSOROOTDIR, 'extraneous/praat'];
        end;

        praatcommand=[PRAATCOMM,' "',MSOROOTDIR,...
                'helpers/extract_pitch.praat" ',...
                FILENM, '.wav', ' ', FILENM, '.Pitch ', SNDDIR];

        retval=system(praatcommand);
        delete(FULLFILENM);
        if(retval==0),
                fprintf('Got pitch from praat!\n');
                praatpitch=load([SNDDIR, FILENM, '.Pitch']);
                fprintf('Pitches read...\n');
        else,
                praatpitch=[];  % keep the output defined when Praat fails
                fprintf('Praat error occurred.\n');
                input('Press Enter to continue...');
        end;

extract_pitch.praat

# Extract pitch from a sound file (first parameter) and write it out to a text
# file (second parameter). Both files live in the directory (third parameter).

form PitchExtractor
        sentence sound_file_name
        sentence pitch_file_name
        sentence Directory
endform

echo Reading from 'Directory$''sound_file_name$'
Read from file... 'Directory$''sound_file_name$'
To Pitch... 0.0 75 600

pitchID = selected("Pitch");
Down to PitchTier
pitchtierID = selected("PitchTier")
num_points = Get number of points

filedelete 'Directory$''pitch_file_name$'
echo Writing to 'Directory$''pitch_file_name$'
for i to num_points
        time = Get time from index... i
        hertz = Get value at index... i
        fileappend "'Directory$''pitch_file_name$'" 'time' 'hertz' 'newline$'
endfor
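The title promises Python as well, so here is a minimal sketch of the reading side. It assumes the text file has the layout written by the loop above (one `time hertz` pair per line, separated by whitespace). The function names `parse_pitch` and `load_pitch` are hypothetical, and the sample values are made up.

```python
def parse_pitch(text):
    """Turn 'time hertz' lines into a list of (time, hertz) float pairs."""
    points = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        points.append((float(fields[0]), float(fields[1])))
    return points


def load_pitch(path):
    """Read a pitch file written by extract_pitch.praat."""
    with open(path) as f:
        return parse_pitch(f.read())


# Made-up values in the same layout as the .Pitch text file.
sample = "0.25 118.5 \n0.26 119.25 \n"
print(parse_pitch(sample))
```

Keeping the parser separate from the file reading makes it easy to compare against your own pitch tracker on strings in a test, without running Praat.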

Virtual Cocktail Party

Thursday, August 7th, 2008

Problem: I want a cocktail party NOW!!!

OK: play multiple media/audio streams dynamically localized in space.

Briefly: Compile spatialization plug-in for mplayer and run multiple instances with different parameters.

Solution:
It’s a hack, it’s a fix, it makes for a real party, especially when you play those A. O. Scott podcasts, with the same person going on in three streams!

Compile the filter into mplayer by adding af_spatialize.c to the list of audio filters compiled into libaf:

  • Code: af_spatialize.c
  • Header carried over from af_hrtf.c, listing some constants: af_spatialize.h
  • Header with head-related transfer functions for various directions: hrtf_22050_3.h

With the spatialization plugin, a file can be localized at angles compiled into the plugin.


cocktail.sh 45 file1.mp3 &
cocktail.sh -45 file2.mp3 &

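cocktail.sh

#!/bin/bash
# Usage: cocktail.sh ANGLE FILE
# Quoting keeps file names with spaces intact; both stdout and stderr are silenced.
~/mplayer/mplayer -af volnorm,resample=22050,spatialize="${1}" "${2}" > /dev/null 2>&1 &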
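To start a whole party at once, the per-stream calls can be generated from a list of angles and files. This is a sketch only: it assumes the patched mplayer lives at `~/mplayer/mplayer`, as in the script above, and that the custom `spatialize` filter takes the angle as its single argument. `build_command` and `start_party` are hypothetical helper names.

```python
import os
import subprocess


def build_command(angle, path, mplayer="~/mplayer/mplayer"):
    """Command line for one stream localized at `angle` degrees."""
    filters = f"volnorm,resample=22050,spatialize={angle}"
    return [os.path.expanduser(mplayer), "-af", filters, path]


def start_party(streams):
    """Launch one mplayer process per (angle, file) pair, output silenced."""
    return [
        subprocess.Popen(build_command(angle, path),
                         stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        for angle, path in streams
    ]


if __name__ == "__main__":
    # Same streams as the shell example: +45 and -45 degrees.
    for angle, path in [(45, "file1.mp3"), (-45, "file2.mp3")]:
        print(" ".join(build_command(angle, path)))
```

Passing the arguments as a list avoids shell quoting problems with file names, and the returned process handles can be used to stop the streams later.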

Coherence Horse

Sunday, June 8th, 2008

Coherence (and history) plotted against direct-to-net energy ratio

It’s in Pegasus’ league.

Information Theoretic Feature Selection for Clustering

Saturday, May 3rd, 2008

Appropriate feature selection and weighting are crucial for clustering algorithms to successfully handle multi-dimensional data. A feature is relevant when it is correlated with the classification; ideally it is also independent of the other features, though in practice it may be correlated with them. For any feature, these characteristics, and hence the weighting, can be determined using information theoretic quantities, e.g., its mutual information with the other features and with the veridical cluster assignment available from training data. An application of the technique to feature weighting in a speech separation task is presented.
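As a rough illustration of the quantities involved, the sketch below computes mutual information between discrete (quantized) features and the true cluster labels, then scores each feature by its relevance to the labels minus its mean redundancy with the other features. That scoring rule, the function names, and the toy data are assumptions made for this example; the abstract does not spell out the actual weighting.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information, in bits, between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def feature_weights(features, labels):
    """Relevance to the labels minus mean redundancy with the other features.

    This scoring rule is an illustrative assumption, not the post's formula.
    """
    weights = {}
    for name, values in features.items():
        relevance = mutual_information(values, labels)
        others = [v for k, v in features.items() if k != name]
        redundancy = (sum(mutual_information(values, o) for o in others) / len(others)
                      if others else 0.0)
        weights[name] = max(relevance - redundancy, 0.0)
    return weights


# Toy training data: quantized features for eight time-frequency pixels.
labels = [0, 0, 0, 0, 1, 1, 1, 1]          # true source of each pixel
features = {
    "pitch": [0, 0, 0, 0, 1, 1, 1, 1],     # tracks the source exactly
    "location": [0, 0, 0, 1, 1, 1, 1, 0],  # mostly tracks the source
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],     # unrelated to the source
}
print(feature_weights(features, labels))
```

Continuous features such as pitch would need to be binned first; the bin width is another choice this sketch leaves open.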



Auditory Neuroscience Lab logo, redux

Thursday, January 24th, 2008

Lab logos

It is not the aim of my dissertation to design a lab logo. (It is actually better described by the blog title and in the “about” link at the top right corner.)

BU Science Day 2007: Cocktail Party with R2D2

Thursday, November 22nd, 2007

sciday poster

Classify this!

Thursday, November 22nd, 2007

Two sources (red and green), one male and one female, at -45 and +45 degrees in azimuth, are speaking simultaneously. Their features, such as pitch and location, are scattered about as much as is shown in the figure. We have no issues hearing out one from the other. Neither will NEXUS 6.

time-frequency pixels of two sources with various features