I'm trying to run serverless LibreOffice based on this tutorial. Here is the full Python Lambda function:
import boto3
import os

s3_bucket = boto3.resource("s3").Bucket("lambda-libreoffice-demo")

os.system("curl https://s3.amazonaws.com/lambda-libreoffice-demo/lo.tar.gz -o /tmp/lo.tar.gz && cd /tmp && tar -xf /tmp/lo.tar.gz")

convertCommand = "instdir/program/soffice --headless --invisible --nodefault --nofirststartwizard --nolockcheck --nologo --norestore --convert-to pdf --outdir /tmp"

def lambda_handler(event, context):
    inputFileName = event['filename']

    # Download the object to be converted from S3
    with open(f'/tmp/{inputFileName}', 'wb') as data:
        s3_bucket.download_fileobj(inputFileName, data)

    # Execute LibreOffice to convert the input file
    os.system(f"cd /tmp && {convertCommand} {inputFileName}")

    # Save the converted object back to S3
    outputFileName, _ = os.path.splitext(inputFileName)
    outputFileName = outputFileName + ".pdf"
    with open(f"/tmp/{outputFileName}", "rb") as f:
        s3_bucket.put_object(Key=outputFileName, Body=f, ACL="public-read")
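One thing worth noting about the code above: every shell step runs through os.system, which discards failures silently. A small helper built on subprocess.run (a sketch of my own, not part of the tutorial) would surface tar's actual exit code and stderr instead:

```python
import subprocess

def run_checked(cmd):
    """Run a shell command; raise with the captured stderr if it fails."""
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"command failed (exit {result.returncode}): {result.stderr.strip()}"
        )
    return result.stdout

# A failing step now reports why, instead of failing silently:
try:
    run_checked("tar -xf /tmp/does-not-exist.tar.gz")
except RuntimeError as e:
    print(e)
```

With this in place of os.system, the extraction step would have raised an error with tar's own message rather than letting the handler continue to the missing PDF.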
The response when running the full script is:
"errorMessage": "ENOENT: no such file or directory, open '/tmp/example.pdf'",
So I began debugging it line by line.
Based on my debug prints, it seems to fail right at the start, when extracting the binary in the second statement:
os.path.exists('/tmp/lo.tar.gz')  # => True
os.path.exists('/tmp/instdir/program/soffice.bin')  # => False
So it looks like the tar extraction is the problematic part.
If I download the file from S3 and run the tar command locally, it extracts just fine.
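Since tar works locally, one way to rule out the shell invocation itself is to extract in-process with Python's tarfile module, so any failure raises a visible exception instead of disappearing inside os.system (a sketch; the /tmp paths mirror the ones above):

```python
import tarfile

def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive in-process; failures raise an exception."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(path=dest_dir)

# In the Lambda, this would replace the `tar -xf` shell step:
# extract_archive("/tmp/lo.tar.gz", "/tmp")
```

If the archive were truncated or not actually gzip-compressed, this raises tarfile.ReadError with a concrete message, which would narrow things down quickly.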
I tried with Node, Python 3.8, and Python 3.6.
I also tried it with and without the layer (and the /opt/lo.tar.br path) as described here.
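Another thing I'd check (my own guess, not something the tutorial covers): Lambda's /tmp defaults to 512 MB, and the extracted LibreOffice tree plus the archive itself can come close to that, so tar may be dying on a full disk. shutil.disk_usage makes the check a one-liner:

```python
import shutil

def tmp_free_mb(path="/tmp"):
    """Return free space at `path` in MB (Lambda's /tmp defaults to 512 MB)."""
    usage = shutil.disk_usage(path)
    return usage.free // (1024 * 1024)

print(f"free space: {tmp_free_mb()} MB")
```

Printing this before and after the extraction attempt would show whether the filesystem filled up mid-extract.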
question from:
https://stackoverflow.com/questions/65884502/aws-lambda-tar-file-extraction-doesnt-seem-to-work