The Pianist is a skill to assist musicians with their everyday tasks. It can give you a pitch when you need to tune your instrument. For singers, it can provide a vocal warmup that goes as low as C3 (130.81 Hz) and as high as G6 (1568.0 Hz).
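The endpoints of that range follow from twelve-tone equal temperament relative to A4 = 440 Hz; a quick sketch (the class and method names here are illustrative, not from the skill's source):

```java
// Compute equal-temperament pitch frequencies relative to A4 = 440 Hz.
public class Pitch {

    /** @param midiNote MIDI note number (C3 = 48, A4 = 69, G6 = 91) */
    public static double frequency(final int midiNote) {
        // Each semitone up multiplies the frequency by the twelfth root of two.
        return 440.0 * Math.pow(2.0, (midiNote - 69) / 12.0);
    }

    public static void main(final String[] args) {
        System.out.printf("C3 = %.2f Hz%n", frequency(48)); // ≈ 130.81
        System.out.printf("G6 = %.2f Hz%n", frequency(91)); // ≈ 1568.0
    }
}
```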
Lean Approach

We took a lean approach to creating this skill -- using the build, measure, learn, repeat cycle. We delivered the skill in three iterations. The first release of the skill could only play an A and do a simple ascending warmup. An A is sufficient to tune most instruments, and the simple warmup was comprehensive enough for many singers.
To measure the results of each release, we gathered quantitative usage data using AWS CloudWatch metrics for the λ. We also collected qualitative feedback on the skill through the reviews in the Alexa app.
After the initial release, we observed that people were willing to use the skill and that it was worthy of further development. Subsequent iterations introduced the ability to play the other 11 notes in a 12-tone chromatic scale and the ability to continue warming up by going higher, lower, back up, back down, and repeating.
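The "higher, lower, back up, back down, repeat" navigation amounts to a small piece of session state. A minimal sketch of that idea, assuming the warmup is stored as an ordered list of audio chunks and clamping at the ends (the real skill's classes and behaviour at the boundaries may differ):

```java
// Sketch of navigating between warmup chunks in response to
// "higher", "lower", and "repeat" commands. Names are illustrative.
public class WarmupSession {

    private final int chunkCount;
    private int current; // index of the chunk to play next

    public WarmupSession(final int chunkCount) {
        this.chunkCount = chunkCount;
        this.current = 0; // start at the lowest chunk
    }

    /** Move to the next-higher chunk, staying within range. */
    public int higher() {
        current = Math.min(current + 1, chunkCount - 1);
        return current;
    }

    /** Move to the next-lower chunk, staying within range. */
    public int lower() {
        current = Math.max(current - 1, 0);
        return current;
    }

    /** Play the same chunk again. */
    public int repeat() {
        return current;
    }
}
```

Clamping at the boundaries is just one possible design choice here; a skill could instead end the session or prompt the singer when the range is exhausted.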
Record Sound Clip

The first step is to record a sound clip using the piano and sound recorder. The total playback time of the audio, plus any synthesized speech, cannot exceed 90 seconds. It was for this reason that the warmup had to be split into multiple chunks. This technical limitation ended up making the skill more versatile -- allowing singers to customize their warmup routine.
Format Sound Clip

The audio file needs to be in MPEG version 2 (mp3) format. It may not be obvious, but this implies the sample rate must be either 22050 Hz, 24000 Hz, or 16000 Hz. In addition, the bit rate must be 48 kbps. More details are available in the Alexa Skills Kit documentation. I used the following FFmpeg command to convert an audio clip:
ffmpeg -y -i c4_-_c5.m4a -ar 16000 -ab 48k -codec:a libmp3lame -ac 1 c4_-_c5.mp3
The audio file needs to be hosted on a publicly-accessible host via HTTPS using an Amazon-trusted SSL certificate. Again, more details are available in the documentation.
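Once hosted, the clip is played by embedding its URL in an SSML `audio` tag in the skill's response. A minimal sketch of building that SSML by hand (the URL below is a placeholder, and the class and method names are ours, not the skill's source):

```java
// Build the SSML that tells Alexa to play a hosted audio clip and then
// speak a follow-up prompt. The clip URL must be served over HTTPS with
// an Amazon-trusted certificate.
public class AudioSsml {

    public static String audioResponse(final String clipUrl, final String followUpText) {
        return "<speak>"
                + "<audio src=\"" + clipUrl + "\"/>"
                + followUpText
                + "</speak>";
    }

    public static void main(final String[] args) {
        // Placeholder host; substitute wherever the clips are actually served.
        System.out.println(audioResponse(
                "https://example.com/audio/c4_-_c5.mp3",
                "Say higher, lower, or repeat."));
    }
}
```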
Challenges

We encountered several unexpected challenges while developing the skill.
Pronunciation

First, while handling the boundary cases for the warmups, we noticed that the Echo had trouble pronouncing foreign names like Popoli di Tessaglia and Die Entführung aus dem Serail. Fortunately, the Alexa Skills Kit supports a subset of the International Phonetic Alphabet (IPA). Although it does not support the full set of phones necessary to correctly pronounce the original Italian and German, we managed to synthesize an acceptable American pronunciation of the names using:
po.po.li di tɛ.ˈsɑljə
ɛnt.ˈfuɹʊŋ aus dem sɛˈɹaɪ

Unexpected Slot Values
Next, by observing invocation errors in the CloudWatch metrics and digging into the CloudWatch logs, we were surprised to find that sometimes the slot values provided to the λ do not exactly match any of the custom slot values we defined. For example, we have a custom slot type with 21 possible values that represent the 12 notes along with their alternative names (e.g. we have both "D Sharp" and "E Flat"). However, sometimes the slot value included punctuation (e.g. "f."). Sometimes it had invalid note names that did not correspond to any of the defined slot values (e.g. "scale"). Finally, sometimes it had extraneous articles (e.g. "a c" and "a c sharp"). To address these issues, we added unit tests to our suite that replicated these problems and then modified the code accordingly.
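A sketch of the kind of slot-value clean-up this implies: strip punctuation, drop a leading article, and canonicalise alternative names before matching. The class and method names are illustrative, and mapping flats onto their enharmonic sharps is our design choice for the sketch, not necessarily the skill's:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative normalisation of raw note-name slot values.
public class NoteSlot {

    private static final Set<String> NOTES = new HashSet<>(Arrays.asList(
            "c", "c sharp", "d", "d sharp", "e", "f", "f sharp",
            "g", "g sharp", "a", "a sharp", "b"));

    /** @return the canonical note name, or null if unrecognised (e.g. "scale") */
    public static String normalize(final String raw) {
        if (raw == null) {
            return null;
        }
        // Strip punctuation and normalise case, e.g. "f." -> "f".
        String value = raw.toLowerCase(Locale.ROOT).replaceAll("[^a-z ]", "").trim();
        value = mapFlats(value);
        if (NOTES.contains(value)) {
            return value;
        }
        // Try dropping a leading article, e.g. "a c sharp" -> "c sharp".
        // (Checked second so a bare "a sharp" is not mangled.)
        if (value.startsWith("a ")) {
            final String stripped = mapFlats(value.substring(2));
            if (NOTES.contains(stripped)) {
                return stripped;
            }
        }
        return null;
    }

    // Canonicalise flats as their enharmonic sharps, e.g. "e flat" -> "d sharp".
    private static String mapFlats(final String value) {
        return value.replace("d flat", "c sharp")
                .replace("e flat", "d sharp")
                .replace("g flat", "f sharp")
                .replace("a flat", "g sharp")
                .replace("b flat", "a sharp");
    }
}
```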
Cold Starts

One thing we noticed while testing the skill was that sometimes the Echo would take a long time to respond. Subsequent invocations would be much faster. This was corroborated by the CloudWatch metrics. Although most of the λ invocations completed in under 1s, there were many outliers that took on the order of 5.5s and one that took up to 7s. This is an unacceptable user experience.
Since we had decided to build the λ in Java, we thought that the culprit might be a combination of copying the archive, decompressing it, and loading classes into memory. To address class loading, we removed unnecessary uses of library classes such as the string utilities and validators from Apache commons-lang. To improve the time to handle the archive, we used ProGuard to shrink the size of the jar by removing unnecessary classes. It took many iterations to get the right classes excluded without breaking essential functionality. In the end, this is the configuration we used:
-dontobfuscate -dontoptimize -dontwarn ch.qos.logback.**,org.joda.** -keep class com.macasaet.** { *; } -keep class com.amazon.** { *; } -keep class com.amazonaws.** { *; } -keepclassmembers enum * { *; } -keepattributes InnerClasses,EnclosingMethod,Signature,*Annotation*
We managed to shrink the archive from 3.1 MB to 1.8 MB. Unfortunately, this had absolutely no effect on the cold start problem. When testing the λ in the AWS console after uploading it, we still observed 5+ second invocations.
Finally, we increased the amount of memory available to the λ, although the maximum memory it ever uses is 35 MB. We raised the available memory from 128 MB to 512 MB. By increasing the memory, we increase the share of the underlying hardware's compute resources allocated to the λ.
This had a tangible effect. In the AWS Console, the initial test invocation of the λ dropped to 2.5 seconds. In addition, the impact was obvious from the CloudWatch metrics as pictured below. Prior to increasing the RAM, there were outlier invocation times from 5.5 seconds to over 7 seconds. Similarly, the 6-hour moving average duration peaked at 2.6 seconds. After increasing the RAM, the outlier invocation times dropped to at most 1.7 seconds. The 6-hour moving average duration dropped to a peak of 674 milliseconds. This makes for a much more seamless experience for the musician.
We hope you enjoy The Pianist! Let us know what you think.