Archive for the ‘dsp’ Category

Speech Scholar

Saturday, October 11th, 2008

STFT and ISTFT

Wednesday, September 17th, 2008

Problem: I need to invert a spectrogram to listen to the signal. And I want to do it the best way possible.

Solution: It isn’t all that sophisticated. Here’s a limited implementation of Yang’s ICASSP 2008 solution, which is (surprise, surprise) very nearly the same as the ubiquitous heuristic overlap-and-add ISTFT.

It’s limited in the following ways:

  • The least-squares (LS) solution is not implemented. It is therefore available only for regularly spaced frequency bands, since in that case the LS solution reduces to the p=2 solution.
  • MATLAB’s fft/ifft routines are used, and they take care of the P matrix (padding with excess zeros to round up to a power of 2).
  • The matrix algebra is not exploited; this is a quick-and-dirty test run.

(more…)
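
For reference, the heuristic overlap-and-add inversion mentioned above can be sketched in a few lines of Python with NumPy. This is not Yang’s algorithm and not the MATLAB code from this post. It is the standard weighted overlap-add estimate: each inverse-transformed frame is windowed again, the frames are summed, and the sum is divided by the summed squared window. The function names `stft` and `istft`, the Hann window, and the frame and hop sizes are assumptions made for illustration.

```python
import numpy as np

def stft(x, win, hop):
    """Windowed FFT of successive frames; returns an array (n_frames, len(win))."""
    n = len(win)
    starts = range(0, len(x) - n + 1, hop)
    return np.array([np.fft.fft(win * x[s:s + n]) for s in starts])

def istft(frames, win, hop):
    """Weighted overlap-add: sum(win * ifft(frame)) / sum(win**2)."""
    n = len(win)
    length = hop * (len(frames) - 1) + n
    num = np.zeros(length)
    den = np.zeros(length)
    for k, frame in enumerate(frames):
        s = k * hop
        num[s:s + n] += win * np.real(np.fft.ifft(frame))
        den[s:s + n] += win ** 2
    den[den < 1e-12] = 1.0   # avoid dividing by zero where the window vanishes
    return num / den

# Round trip on a test tone (all sizes are arbitrary choices)
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)
win = np.hanning(256)
hop = 64
y = istft(stft(x, win, hop), win, hop)

# Compare away from the edges, where the window weights are tiny
err = np.max(np.abs(y[256:-256] - x[256:len(y) - 256]))
```

With an unmodified spectrogram the round trip is exact up to rounding. The interesting case is a modified spectrogram, where the same formula gives a least-squares compromise between overlapping frames.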

Get PRAAT Pitch into MATLAB/Python

Thursday, August 7th, 2008

Problem: I want to compare my pitch extraction algorithm to PRAAT’s.

Briefly: Write a Praat script to extract the pitch and dump it into a text file, then read that file into your favorite environment.

Solution:
The following get_praat_pitch.m calls the extract_pitch.praat script.


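function praatpitch=get_praat_pitch(snd, fs)
% GET_PRAAT_PITCH  Run Praat on a signal and return its pitch track.
%   snd: signal samples, fs: sampling rate in Hz.
%   Returns an N-by-2 matrix [time hertz], or [] if Praat fails.

% TODO hardcoded location of praat
        fprintf('WARNING get_praat_pitch: Hardcoded location of praat and extract_pitch.praat\n');
        global MSOROOTDIR
        SNDDIR=[MSOROOTDIR, 'sep', filesep];
        FILENM='praat-tmp';
        FULLFILENM=[SNDDIR, FILENM, '.wav'];
        % Write the signal to a temporary wav file for Praat to read.
        % Newer MATLAB releases replace wavwrite with audiowrite(FULLFILENM, snd, fs).
        wavwrite(snd, fs, FULLFILENM);

        % The Praat console binary differs by platform
        if(ispc()),
                PRAATCOMM=['"', MSOROOTDIR, 'extraneous/praatcon-win', '"'];
        elseif(isunix()),
                PRAATCOMM=[MSOROOTDIR, 'extraneous/praat'];
        end;

        % Script arguments: sound file name, pitch file name, directory
        praatcommand=[PRAATCOMM,' "',MSOROOTDIR,...
                'helpers/extract_pitch.praat" ',...
                FILENM, '.wav', ' ', FILENM, '.Pitch ', SNDDIR];

        retval=system(praatcommand);
        delete(FULLFILENM);
        if(retval==0),
                fprintf('Got pitch from praat!\n');
                praatpitch=load([SNDDIR, FILENM, '.Pitch']);
                fprintf('Pitches read...\n');
        else,
                fprintf('Praat error occurred.\n');
                praatpitch=[];  % keep the output defined on failure
                input('Press Enter to continue...');
        end;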

extract_pitch.praat

# Extract pitch from a sound file (first parameter) and write it out to a text
# file (second parameter). Both files live in the directory given as the third
# parameter.

form PitchExtractor
        sentence sound_file_name
        sentence pitch_file_name
        sentence Directory
endform

echo Reading from 'Directory$''sound_file_name$'
Read from file... 'Directory$''sound_file_name$'
To Pitch... 0.0 75 600

pitchID = selected("Pitch");
Down to PitchTier
pitchtierID = selected("PitchTier")
num_points = Get number of points

filedelete 'Directory$''pitch_file_name$'
echo Writing to 'Directory$''pitch_file_name$'
for i to num_points
        time = Get time from index... i
        hertz = Get value at index... i
        fileappend "'Directory$''pitch_file_name$'" 'time' 'hertz' 'newline$'
endfor
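
Since the title promises Python as well, here is a minimal sketch of the reading side there. It assumes the same two-column text file that extract_pitch.praat writes (time, then frequency in Hz, one point per line). The function name `load_praat_pitch` is hypothetical, and the demo file contents are made up.

```python
import os
import tempfile
import numpy as np

def load_praat_pitch(path):
    """Return (times, hertz) arrays from a whitespace-separated two-column file."""
    data = np.loadtxt(path, ndmin=2)   # ndmin=2 keeps a single-row file 2-D
    return data[:, 0], data[:, 1]

# Demo with a made-up file standing in for praat-tmp.Pitch
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "praat-tmp.Pitch")
    with open(demo_path, "w") as f:
        f.write("0.010 120.5 \n0.020 121.0 \n0.030 119.8 \n")
    times, hertz = load_praat_pitch(demo_path)
```

Running Praat itself from Python would follow the same pattern as the MATLAB function: write a temporary wav file, call the script, then read the result back.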

Virtual Cocktail Party

Thursday, August 7th, 2008

Problem: I want a cocktail party NOW!!!

OK: play multiple media/audio streams dynamically localized in space.

Briefly: Compile spatialization plug-in for mplayer and run multiple instances with different parameters.

Solution:
It’s a hack, it’s a fix, it makes for a real party, especially when you play those A. O. Scott podcasts, with the same person going on in three streams!

Compile the filter into mplayer by adding af_spatialize.c to the list of audio filters compiled in libaf:

  • Code: af_spatialize.c
  • Header carried over from af_hrtf.c, listing some constants: af_spatialize.h
  • Header with head-related transfer functions for various directions: hrtf_22050_3.h

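With the spatialization plugin, a file can be localized at any of the angles compiled into the plugin. For example, to place two files at angles 45 and -45:


cocktail.sh 45 file1.mp3 &
cocktail.sh -45 file2.mp3 &

where cocktail.sh is:

#!/bin/bash
# Usage: cocktail.sh ANGLE FILE
# Quoting the arguments keeps file names with spaces intact, so no IFS hack is needed.
# stdout is redirected first and stderr second, so that both are silenced.
~/mplayer/mplayer -af volnorm,resample=22050,spatialize="${1}" "${2}" > /dev/null 2>&1 &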
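
To start the whole party from one place, the per-stream command can also be built and launched from Python. This is a sketch that assumes the same mplayer build and the same filter chain as cocktail.sh. The helper names `build_command` and `start_party` are hypothetical, and the angles must be ones compiled into the plugin.

```python
import os
import subprocess

MPLAYER = os.path.expanduser("~/mplayer/mplayer")  # same path as in cocktail.sh

def build_command(angle, path):
    """Argument list for one spatialized stream (filter chain copied from cocktail.sh)."""
    filters = "volnorm,resample=22050,spatialize={}".format(angle)
    return [MPLAYER, "-af", filters, path]

def start_party(streams):
    """Launch one mplayer per (angle, path) pair; returns the Popen handles."""
    return [subprocess.Popen(build_command(angle, path),
                             stdout=subprocess.DEVNULL,
                             stderr=subprocess.DEVNULL)
            for angle, path in streams]

# Only the commands are built here; call start_party(streams) to actually play.
streams = [(45, "file1.mp3"), (-45, "file2.mp3")]
commands = [build_command(angle, path) for angle, path in streams]
```

Passing the arguments as a list sidesteps shell quoting altogether, which is the problem the IFS setting in the original script worked around.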

Coherence Horse

Sunday, June 8th, 2008

Coherence (and history) plotted against direct-to-net energy ratio

It’s in Pegasus’ league.

Frequency Glides

Friday, May 30th, 2008

Problem: I want to generate a sound with frequency glides.

Wild frequency glides!

Solution: A frequency glide is a sound whose frequency evolves with time. Once the frequency is characterized, i.e. the frequency at every point in time is known, the signal can be generated by integrating the frequency to get the phase and passing that phase to a sine or cosine function.
(more…)
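
As a sketch of that recipe in Python with NumPy: the phase is the running integral of the instantaneous frequency, so a cumulative sum of 2π f(t)/fs fed to a sine gives the glide. The linear 200 to 2000 Hz sweep, the sample rate, and the function name `glide` are assumptions made for illustration.

```python
import numpy as np

def glide(freq_hz, fs):
    """Synthesize a tone whose instantaneous frequency follows freq_hz (one value per sample)."""
    phase = 2 * np.pi * np.cumsum(freq_hz) / fs   # discrete integral of the frequency
    return np.sin(phase)

fs = 16000
t = np.arange(fs) / fs                       # one second of samples
freq = np.linspace(200.0, 2000.0, len(t))    # linear upward glide
sig = glide(freq, fs)
```

Any frequency trajectory works the same way: replace `freq` with a sinusoidally wobbling or piecewise curve and the wild glides follow. Writing `np.sin(2 * np.pi * freq * t)` instead would be wrong, because it skips the integration and produces a different instantaneous frequency.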

Fast Cross-correlogram

Monday, May 26th, 2008

Problem: I want to compute a cross-correlogram. Fast. Can it not be done quick and dirty in the spectral domain?

Solution: Cross-correlation is straightforward to compute in the spectral domain: multiply ft(sig1) by conj(ft(sig2)) and transform back. Computing the cross-correlogram requires successive windowed cross-correlations. MATLAB’s specgram already computes successive windowed Fourier transforms: can we leverage it?
(more…)
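
The spectral-domain identity is easy to check in Python with NumPy. This is a sketch under assumptions, not the specgram-based trick the post goes on to describe: `xcorr_fft` and `cross_correlogram` are hypothetical names, frames are Hann-windowed, and each frame is zero-padded to twice its length so the circular correlation does not wrap around.

```python
import numpy as np

def xcorr_fft(a, b):
    """Cross-correlation via ifft(fft(a) * conj(fft(b))), lags -(n-1)..(n-1)."""
    n = len(a)
    nfft = 2 * n                      # zero-pad so circular wrap-around is avoided
    spec = np.fft.fft(a, nfft) * np.conj(np.fft.fft(b, nfft))
    r = np.real(np.fft.ifft(spec))
    # negative lags sit at the end of the circular result; move them in front
    return np.concatenate((r[-(n - 1):], r[:n]))

def cross_correlogram(x, y, frame_len, hop):
    """One windowed cross-correlation per frame; rows are frames, columns are lags."""
    win = np.hanning(frame_len)
    starts = range(0, min(len(x), len(y)) - frame_len + 1, hop)
    return np.array([xcorr_fft(win * x[s:s + frame_len], win * y[s:s + frame_len])
                     for s in starts])

# y is x delayed by 5 samples; with this sign convention the peak sits at lag -5
rng = np.random.default_rng(0)
x = rng.standard_normal(4096)
y = np.roll(x, 5)
cc = cross_correlogram(x, y, frame_len=512, hop=256)
lags = np.arange(-(512 - 1), 512)
peak_lags = lags[np.argmax(cc, axis=1)]
```

The lag ordering matches `np.correlate(a, b, "full")`, which makes the direct time-domain computation a convenient cross-check on short signals.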

Organized Sources of Knowledge Online

Tuesday, February 12th, 2008

Poisson Control

Friday, November 9th, 2007

Problem: A digital camera photograph is noisy.

Discussion: This noise is generally modeled as the sum of Gaussian noise, constant “dark noise”, and Poisson noise. In situations where Poisson noise is the limiting factor, the noise can be modeled as a spatial Poisson point process, and that model can be used to estimate the signal.

Under these assumptions it comes down to: smooth over large areas when signal intensity is high, small areas when it’s low.
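
To make the noise model concrete, here is a small simulation in Python with NumPy. It is only a sketch: the dark level, the Gaussian standard deviation, and the intensities are made-up numbers, and the name `noisy_pixels` is hypothetical. It illustrates the one property the Poisson term brings: its variance equals the mean signal, so the absolute noise grows with intensity while the relative noise shrinks.

```python
import numpy as np

def noisy_pixels(intensity, n, rng, dark=2.0, read_sigma=1.0):
    """Simulate n readings of one pixel: Poisson counts + constant dark level + Gaussian noise."""
    shot = rng.poisson(intensity, size=n)          # Poisson (photon-counting) noise
    read = rng.normal(0.0, read_sigma, size=n)     # Gaussian noise
    return shot + dark + read

rng = np.random.default_rng(0)
for intensity in (10.0, 100.0, 1000.0):
    samples = noisy_pixels(intensity, 100000, rng)
    mean = samples.mean() - 2.0        # remove the known dark level
    std = samples.std()
    print(f"intensity {intensity:7.1f}: std {std:6.2f}, std/mean {std / mean:.3f}")
```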

What more can the Poisson model yield?