Dec 11, 2024

Amir Houieh

AI Hack Night Challenge - Matching Problem

Hello [new] world!

At Unbody, we’re building tools for AI-native products—apps where AI isn’t just bolted on but is the foundation. Forget the duct-taped systems of yesterday. Unbody lets developers build smarter, unified, and scalable AI-native platforms.

The Challenge: Solve the “Matching” Problem

Your challenge is to create a platform that tackles the matching problem. Whether it’s connecting users to therapists, matching event attendees with investors, or even building a dating app—your goal is to build a solution that intelligently matches one group to another.

You can:

  • Build a generic matching framework that can adapt to various use cases.

  • Focus on a niche scenario like matchmaking in tech events, health care, or personal connections.

Think outside the box, but keep it functional and usable!
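
If you go the generic-framework route, one way to think about it is a scoring function between two groups plus a selection step. The sketch below is plain TypeScript and does not use Unbody at all; `Scorer`, `topMatches`, `Profile`, and the sample data are hypothetical names made up for illustration, and Jaccard similarity over tags is just one simple scoring choice.

```typescript
// A generic scorer: a higher score means a better match between a and b.
type Scorer<A, B> = (a: A, b: B) => number

// Return the k best candidates for `seeker`, sorted by descending score.
function topMatches<A, B>(
  seeker: A,
  candidates: B[],
  score: Scorer<A, B>,
  k = 3,
): { candidate: B; score: number }[] {
  return candidates
    .map((candidate) => ({ candidate, score: score(seeker, candidate) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
}

// Hypothetical use case: match a founder to investors by shared tags.
interface Profile {
  name: string
  tags: string[]
}

// Jaccard similarity: size of the intersection divided by size of the union.
const jaccard: Scorer<Profile, Profile> = (a, b) => {
  const setA = new Set(a.tags)
  const setB = new Set(b.tags)
  let shared = 0
  setA.forEach((tag) => {
    if (setB.has(tag)) shared++
  })
  const union = setA.size + setB.size - shared
  return union === 0 ? 0 : shared / union
}

const founder: Profile = { name: 'Ada', tags: ['ai', 'health', 'seed'] }
const investors: Profile[] = [
  { name: 'Fund A', tags: ['fintech', 'series-a'] },
  { name: 'Fund B', tags: ['ai', 'health', 'seed'] },
  { name: 'Fund C', tags: ['ai', 'series-a'] },
]

const ranked = topMatches(founder, investors, jaccard, 2)
```

Because the scorer is a parameter, the same `topMatches` function could rank therapists, investors, or dates; only the scoring logic changes per use case.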

What You’ll Need (and How Unbody Helps)

For sample code and technical docs, see the bottom of this page.

To build your solution, you’ll need to handle:

  1. A RAG Pipeline: For retrieving and generating relevant results.

  2. A Query Parser: To process and structure user input into actionable queries.

  3. Data Enrichment: To prepare and enhance raw data for better matching results.

  4. (Optional) Multimedia Hosting: If you’re working with images or videos.

Here’s how Unbody helps with each part:

  • Data Ingestion: Import data from platforms like Google Drive or Discord using pre-built connectors. Custom schemas are also available but optional for this challenge.

  • Data Processing and Enhancements: Configure how your data is parsed, cleaned, enriched, and indexed. Use pre-built enhancers like summarization or keyword extraction, or define your own.

  • GraphQL API/TypeScript SDK: Access your processed data through a single interface for querying and integration.

  • Backend Management: Unbody handles chunking, vectorization, indexing, and hosting automatically.
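
The query parser from the list above can start out very simple. Here is a minimal rule-based sketch that turns free text into a list of search concepts plus a couple of filters. Everything in it (`ParsedQuery`, `parseQuery`, the stop-word list, the "top N" and "in <city>" conventions) is an assumption for illustration, not part of Unbody; in a real project you might hand this step to an LLM instead.

```typescript
// Structured form of a free-text matching request (hypothetical shape).
interface ParsedQuery {
  concepts: string[] // terms to pass on to semantic search
  city?: string // optional hard filter
  maxResults: number
}

const STOP_WORDS = [
  'a', 'an', 'the', 'in', 'for', 'who', 'with', 'me', 'find', 'top', 'and',
]

function parseQuery(input: string): ParsedQuery {
  const text = input.toLowerCase()

  // "top 5" style limit, defaulting to 10 when absent.
  const limitMatch = text.match(/\btop (\d+)\b/)
  const maxResults = limitMatch ? parseInt(limitMatch[1], 10) : 10

  // "in <city>" style location filter (single-word cities only in this sketch).
  const cityMatch = text.match(/\bin ([a-z]+)\b/)
  const city = cityMatch ? cityMatch[1] : undefined

  // Everything that is not punctuation, a stop word, the city, or a number
  // is treated as a search concept.
  const concepts = text
    .replace(/[^a-z0-9\s-]/g, ' ')
    .split(/\s+/)
    .filter((word) => word.length > 0 && STOP_WORDS.indexOf(word) === -1)
    .filter((word) => word !== city && !/^\d+$/.test(word))

  return { concepts, city, maxResults }
}

const parsed = parseQuery('Find top 5 therapists in Amsterdam for anxiety')
```

The `concepts` array is the kind of input you could then pass to a semantic search, while `city` and `maxResults` map naturally to filters and limits.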

Examples of Spark Ideas

Here are some inspiring examples of what you could build:

  • Tech Event Matchmaker: Match startup founders with potential investors based on industry focus, investment stage, and mutual interests.

  • Skill Exchange Platform: Connect people wanting to learn specific skills with those who can teach them, using expertise matching and scheduling compatibility.

  • Next-gen Dating App: Match potential partners on deeper compatibility factors such as shared values, life goals, and communication styles, or even on pictures and appearance. Use AI to analyze written responses and conversation patterns for more meaningful connections.
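
Some of these signals are plain arithmetic rather than AI. As a rough sketch, the "scheduling compatibility" part of the Skill Exchange idea could be the number of hours two people are both free. The `Slot` shape and the `overlapHours` helper are hypothetical, and the sketch assumes one person's slots do not overlap each other.

```typescript
// A weekly availability slot; hours use a 24-hour clock (hypothetical shape).
interface Slot {
  day: 'mon' | 'tue' | 'wed' | 'thu' | 'fri' | 'sat' | 'sun'
  start: number
  end: number
}

// Total hours per week during which both people are available.
function overlapHours(a: Slot[], b: Slot[]): number {
  let total = 0
  for (const x of a) {
    for (const y of b) {
      if (x.day !== y.day) continue
      const start = Math.max(x.start, y.start)
      const end = Math.min(x.end, y.end)
      if (end > start) total += end - start
    }
  }
  return total
}

const learner: Slot[] = [
  { day: 'mon', start: 18, end: 21 },
  { day: 'sat', start: 10, end: 12 },
]
const teacher: Slot[] = [
  { day: 'mon', start: 19, end: 22 },
  { day: 'sun', start: 10, end: 12 },
]

// Both are free on Monday from 19:00 to 21:00.
const sharedHours = overlapHours(learner, teacher)
```

A number like `sharedHours` can be used as a hard filter (no overlap, no match) or blended into a ranking score next to semantic similarity.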

Criteria for Success

The best solutions will:

  1. Be Functional and Scalable: Build something that solves a real-world matching problem.

  2. Utilize Unbody’s APIs: Show how you leveraged our stack to simplify and enhance your development.

  3. Focus on User Experience: Make sure the solution is intuitive and practical for end-users.

Rewards

  • A premium gadget.

  • 100 minutes of Unbody build-time credit.

  • Recognition and community respect.

Support During the Hackathon

We’re here to help! Join our Discord channel to get real-time support from our team. Whether it’s about Unbody APIs, debugging, or architecture, we’ve got you covered.


---

How It Works: A Closer Look

Here’s how Unbody helps you build your matching solution, step by step:

1. Ingest Your Data

  • Third-Party Connectors: Quickly pull data from platforms like Google Drive or Discord.

  • Custom Schemas: Optional for advanced cases; define how your data is structured.
    Docs link.

    import {
      AutoEntities,
      AutoSummary,
      AutoTopics,
      AutoVision,
      CustomSchema,
      Generative,
      ProjectSettings,
      TextVectorizer,
      UnbodyAdmin,
      AutoKeywords,
      PdfParser,
      QnA,
    } from 'unbody/admin'
    
    const settings = new ProjectSettings()
    
    // set up vectorizer
    settings.set(new TextVectorizer(TextVectorizer.OpenAI.TextEmbedding3Small))
    
    // set up Generative modules
    settings.set(new QnA(QnA.OpenAI.GPT3_5TurboInstruct))
    settings.set(new Generative(Generative.OpenAI.GPT4o))
    
    // set up auto-enhancement modules
    settings.set(new AutoSummary(AutoSummary.OpenAI.GPT4o))
    settings.set(new AutoKeywords(AutoKeywords.OpenAI.GPT4oMini))
    settings.set(new AutoTopics(AutoTopics.OpenAI.GPT4oMini))
    settings.set(new AutoEntities(AutoEntities.OpenAI.GPT4oMini))
    settings.set(new AutoVision(AutoVision.OpenAI.GPT4o))
    
    // configure PDF parser
    settings.set(new PdfParser(PdfParser.Pdf2Image.Default))
    
    // extend ImageBlock collection
    const customSchema = new CustomSchema()
    customSchema.extend(
      new CustomSchema.Collection('ImageBlock').add(
        new CustomSchema.Field.Boolean('xFacialVisibility', 'Facial Visibility'),
        new CustomSchema.Field.Number(
          'xEstimatedAgeRange',
          'Estimated Age Range',
          true,
        ),
      ),
    )
    
    // extend TextDocument collection
    customSchema.extend(
      new CustomSchema.Collection('TextDocument').add(
        new CustomSchema.Field.Text('xQnA', 'Generated QnAs', true),
      ),
    )
    
    settings.set(customSchema)

2. Process and Enhance

  • Parsing: Select parsers based on your data format (e.g., PDFs, text documents).

  • Enhancers: Use built-in tools like summarization, keyword extraction, OCR, or image captioning to improve your data. You can also create custom enhancers if needed.
    Enhancer setup guide.

    import { UnbodyAdmin, Enhancement } from 'unbody/admin'
    
    const enhancement = new Enhancement()
    
    const imagePipeline = new Enhancement.Pipeline(
      'enrich_image_block',
      'ImageBlock',
    )
    
    imagePipeline.add(
      new Enhancement.Step(
        'extract_metadata',
        new Enhancement.Action.StructuredGenerator({
          model: 'openai-gpt-4o',
          prompt: 'Extract metadata from the image',
          schema: (ctx, { z }) =>
            z.object({
              facialVisibility: z
                .boolean()
                .describe('Whether the face is visible in the image or not'),
              estimatedAgeRange: z
                .array(z.number().int())
                .describe(
                  "Estimated age range of the person in the image; if it's not a person, the range will be an empty array",
                ),
            }),
          images: (ctx) => [{ url: ctx.record.url }],
        }),
        {
          if: (ctx) => ctx.record.url && ctx.record.url.startsWith('https://'),
    
          output: {
            xFacialVisibility: (ctx) => ctx.result.json.facialVisibility,
            xEstimatedAgeRange: (ctx) => ctx.result.json.estimatedAgeRange || [],
          },
        },
      ),
    )
    
    const markdownPipeline = new Enhancement.Pipeline(
      'enrich_markdown_file',
      'TextDocument',
      {
        if: (ctx) => ctx.record.mimeType === 'text/markdown',
      },
    )
    
    markdownPipeline.add(
      new Enhancement.Step(
        'generate_answers',
        new Enhancement.Action.StructuredGenerator({
          model: 'openai-gpt-4o',
          prompt: (
            ctx,
          ) => `Answer the following questions based on the content of the markdown file:
          - What is the main topic of the document?
          - Question 2
          - Question 3
    
          ---
          Document content:
          ${ctx.record.text}
          `,
          schema: (ctx, { z }) =>
            z.object({
              answers: z.array(
                z.object({
                  question: z.string(),
                  answer: z.string(),
                }),
              ),
            }),
        }),
        {
          output: {
            xQnA: (ctx) =>
              ctx.result.json.answers
                .map((a) => `${a.question}: ${a.answer}`),
          },
        },
      ),
    )
    
    enhancement.add(imagePipeline)
    enhancement.add(markdownPipeline)
    
    settings.set(enhancement)

3. Create a Project

    // authenticate with your admin key
    const admin = new UnbodyAdmin({
      auth: {
        username: '[admin-key-id]',
        password: '[admin-key-secret]',
      },
    })

    // create the project with the settings configured above
    const project = admin.projects.ref({ name: 'New Project', settings })
    await project.save()

    // create an API key for the SDK and GraphQL API
    const apiKey = await project.apiKeys.ref({ name: 'development' }).save()

    console.log({
      projectId: project.id,
      apiKey: apiKey.key,
    })


4. Interact with AI-Ready Data

  • GraphQL API: Query enriched data to build features like RAG pipelines or semantic search.

  • TypeScript SDK: Simplify queries with pre-built functions.
    SDK and API docs.

    import { Unbody } from 'unbody'
    
    const unbody = new Unbody({
      projectId: '[project-id]',
      apiKey: '[project-api-key]',
    })
    
    // semantic search; `concepts` is your own array of search terms,
    // e.g. ['AI', 'matching']
    const {
      data: { payload: searchResults },
    } = await unbody.get.textDocument.search.about([...concepts]).exec()
    
    for (const record of searchResults) {
      console.log(record.title, record._additional?.distance)
    }
    
    // RAG example
    const {
      data: { generate },
    } = await unbody.get.textDocument
      .where({
        mimeType: 'text/markdown',
      })
      .search.about(['AI', 'AI-native', 'AI-enabled'])
      .limit(2)
      .generate.fromMany(
        `Answer the question based on provided text content. Rely solely on the content of the document to generate the answer.
       
        Question: ${userQuestion} 
        `,
        ['originalName', 'title', 'text'],
      )
      .exec()
    
    console.log(
      generate.result,
      generate.from.map((r) => r.originalName),
    )
    
    
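    // Advanced RAG; `history` holds the previous chat messages and
    // `userQuestion` the latest user input (both are your own variables).
    // The result is renamed on destructuring so it does not clash with
    // `generate` from the RAG example above.
    const {
      data: { generate: chatGenerate },
    } = await unbody.get.textDocument
      .where({
        mimeType: 'text/markdown',
      })
      .search.about(['AI', 'AI-native', 'AI-enabled'])
      .limit(2)
      .generate.fromMany({
        messages: [
          {
            role: 'system',
            content: `Answer user questions based on the content of the provided documents
            
            DOCUMENTS:
            \`\`\`json
            {docs}
            \`\`\`
            `,
          },
          ...history,
          {
            role: 'user',
            content: userQuestion,
          },
        ],
        options: {
          vars: [
            {
              name: 'docs',
              expression: '. | map({ text, title, originalName })',
              formatter: 'jq',
            },
          ],
        },
      })
      .exec()
    
    console.log(chatGenerate.result)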
      
      
    // Structured Output
    import { z } from 'zod'
    import { zodToJsonSchema } from 'zod-to-json-schema'
    
    const {
      data: { payload },
    } = await unbody.generate.json(
      [
        {
          type: 'text',
          content: `Generate a caption and extract metadata from the provided image.`,
        },
        {
          type: 'image',
          content: {
            url: imageUrl,
          },
        },
      ],
      {
        model: 'gpt-4o',
        schema: zodToJsonSchema(
          z.object({
            caption: z.string().describe('Generated caption for the image'),
            objects: z
              .array(z.string())
              .describe('list of objects detected in the image'),
          }),
        ),
      },
    )
    
    console.log(JSON.stringify(payload, null, 2))

5. Build Your Solution

With your data processed, you can focus on:

  • Fetching relevant matches using RAG.

  • Ranking results for smarter recommendations.

  • Adding multimedia features or creating custom enhancements.
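
For the ranking step, a common pattern is to blend semantic closeness with a few rules from your own domain. The sketch below is plain TypeScript with made-up data: `distance` mirrors the `_additional.distance` value logged in the semantic search example above (lower is closer), while `Hit`, `Preferences`, `rerank`, and the 0.2 boost are assumptions you would tune yourself.

```typescript
// A search hit as it might look after a semantic search (hypothetical shape).
interface Hit {
  title: string
  distance: number // lower means semantically closer
  languages: string[]
  available: boolean
}

interface Preferences {
  language: string
}

// Blend semantic closeness with a simple rule-based boost.
// The weights are arbitrary starting points, not tuned values.
function rerank(hits: Hit[], prefs: Preferences): Hit[] {
  const score = (hit: Hit): number => {
    const semantic = 1 - hit.distance // assumes distance is in [0, 1]
    const languageBoost =
      hit.languages.indexOf(prefs.language) !== -1 ? 0.2 : 0
    return semantic + languageBoost
  }

  return hits
    .filter((hit) => hit.available) // hard constraint: drop unavailable matches
    .sort((a, b) => score(b) - score(a))
}

const hits: Hit[] = [
  { title: 'Therapist A', distance: 0.2, languages: ['en'], available: true },
  { title: 'Therapist B', distance: 0.3, languages: ['en', 'nl'], available: true },
  { title: 'Therapist C', distance: 0.1, languages: ['nl'], available: false },
]

const reranked = rerank(hits, { language: 'nl' })
```

Here Therapist B ends up ahead of Therapist A despite a larger distance, because the language preference outweighs the small semantic gap, and Therapist C is dropped for being unavailable.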

Automated Processing

Unbody handles the backend:

  • Cleans and chunks your data.

  • Vectorizes it for semantic search.

  • Indexes it in a database for fast querying.
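
You don't have to write any of this yourself, but it helps to have a mental model of what chunking and vector search do. The sketch below is a toy illustration in plain TypeScript, not Unbody's actual implementation: `chunkWords` and `cosineSimilarity` are hypothetical helpers, and real embeddings would come from a vectorizer model rather than hand-written arrays.

```typescript
// Split text into overlapping word-based chunks.
function chunkWords(text: string, size: number, overlap: number): string[] {
  if (overlap >= size) throw new Error('overlap must be smaller than size')
  const words = text.split(/\s+/).filter((word) => word.length > 0)
  const chunks: string[] = []
  for (let start = 0; start < words.length; start += size - overlap) {
    chunks.push(words.slice(start, start + size).join(' '))
    if (start + size >= words.length) break // last chunk reached the end
  }
  return chunks
}

// Cosine similarity between two equal-length vectors:
// 1 means same direction, 0 means unrelated (orthogonal).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('vectors must have equal length')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB)
  return denominator === 0 ? 0 : dot / denominator
}

// Chunks of 3 words, each sharing 1 word with the previous chunk.
const chunks = chunkWords('one two three four five six seven', 3, 1)

// Toy 2-dimensional "embeddings" for demonstration only.
const same = cosineSimilarity([1, 0], [1, 0])
const unrelated = cosineSimilarity([1, 0], [0, 1])
```

Semantic search then boils down to embedding the query, comparing it against the stored chunk vectors with a measure like this, and returning the closest ones.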