Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(compute): Add pg_duckdb extension v0.2.0 #10350

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open
42 changes: 40 additions & 2 deletions compute/compute-node.Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -34,11 +34,11 @@ RUN case $DEBIAN_VERSION in \
echo "Unknown Debian version ${DEBIAN_VERSION}" && exit 1 \
;; \
esac && \
apt update && \
apt update && \
apt install --no-install-recommends --no-install-suggests -y \
ninja-build git autoconf automake libtool build-essential bison flex libreadline-dev \
zlib1g-dev libxml2-dev libcurl4-openssl-dev libossp-uuid-dev wget ca-certificates pkg-config libssl-dev \
libicu-dev libxslt1-dev liblz4-dev libzstd-dev zstd \
libicu-dev libxslt1-dev liblz4-dev libzstd-dev zstd g++ \
$VERSION_INSTALLS \
&& apt clean && rm -rf /var/lib/apt/lists/*

Expand Down Expand Up @@ -1107,6 +1107,43 @@ RUN wget https://github.com/Mooncake-Labs/pg_mooncake/releases/download/v0.1.0/p
make install -j $(getconf _NPROCESSORS_ONLN) && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/pg_mooncake.control

#########################################################################################
#
# Layer "pg-duckdb-pg-build"
# compile pg_duckdb extension
#
#########################################################################################

FROM build-deps AS pg-duckdb-pg-build
ARG PG_VERSION
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY compute/patches/duckdb-v1-1-3.patch /duckdb-v1-1-3.patch

ENV PATH="/usr/local/pgsql/bin/:$PATH"


# pg_duckdb build requires source dir to be a git repo to get submodules
# allow neon_superuser to execute some functions that in pg_duckdb are available to superuser only
# cache management functions duckdb.cache(), duckdb.cache_info(), duckdb.cache_delete()
# extension management function duckdb.install_extension()
# for debugging purposes raw query and reset ddb duckdb.raw_query(), duckdb.recycle_ddb()
RUN git clone --depth 1 --branch v0.2.0 https://github.com/duckdb/pg_duckdb.git pg_duckdb-src && \
cd pg_duckdb-src && \
git submodule update --init --recursive && \
cd third_party/duckdb && \
patch -p1 < /duckdb-v1-1-3.patch && \
cd ../.. && \
make install -j $(getconf _NPROCESSORS_ONLN) && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/pg_duckdb.control && \
file=/usr/local/pgsql/share/extension/pg_duckdb--0.1.0--0.2.0.sql && \
echo 'GRANT ALL ON FUNCTION duckdb.cache(TEXT, TEXT) TO neon_superuser;' >> $file && \
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder, is it better to keep these grants as a patch for https://github.com/duckdb/pg_duckdb/blob/v0.2.0/sql/pg_duckdb--0.1.0--0.2.0.sql to have everything in one place?

Either way, is there anything else that blocks us from adding this into image? @Bodobolero (thanks for your fixes)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you set disabled_filesystems=‘LocalFileSystem’ which is necessary to fix the security hole S3 access still doesn’t work.
I will see if I find time today to continue working on the patch.

echo 'GRANT ALL ON FUNCTION duckdb.cache_info() TO neon_superuser;' >> $file && \
echo 'GRANT ALL ON FUNCTION duckdb.cache_delete(cache_key TEXT) TO neon_superuser;' >> $file && \
echo 'GRANT ALL ON FUNCTION duckdb.install_extension(TEXT) TO neon_superuser;' >> $file && \
echo 'GRANT ALL ON FUNCTION duckdb.raw_query(TEXT) TO neon_superuser;' >> $file && \
echo 'GRANT ALL ON PROCEDURE duckdb.recycle_ddb() TO neon_superuser;' >> $file


#########################################################################################
#
# Layer "pg_repack"
Expand Down Expand Up @@ -1169,6 +1206,7 @@ COPY --from=pg-ivm-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=pg-partman-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=pg-mooncake-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=pg-repack-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=pg-duckdb-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY pgxn/ pgxn/

RUN make -j $(getconf _NPROCESSORS_ONLN) \
Expand Down
97 changes: 97 additions & 0 deletions compute/patches/duckdb-v1-1-3.patch
Original file line number Diff line number Diff line change
@@ -0,0 +1,97 @@
diff --git a/src/common/virtual_file_system.cpp b/src/common/virtual_file_system.cpp
index 74892a4e05..60e9e7af88 100644
--- a/src/common/virtual_file_system.cpp
+++ b/src/common/virtual_file_system.cpp
@@ -173,7 +173,9 @@ void VirtualFileSystem::SetDisabledFileSystems(const vector<string> &names) {

FileSystem &VirtualFileSystem::FindFileSystem(const string &path) {
auto &fs = FindFileSystemInternal(path);
- if (!disabled_file_systems.empty() && disabled_file_systems.find(fs.GetName()) != disabled_file_systems.end()) {
+ // we allow LocalFileSystem access to duckdb itself (by specifiying trustedContext=true)
+ // but not to duckdb users if disabled_file_systems='LocalFileSystem'
+ if (!isTrustedContext && !disabled_file_systems.empty() && disabled_file_systems.find(fs.GetName()) != disabled_file_systems.end()) {
throw PermissionException("File system %s has been disabled by configuration", fs.GetName());
}
return fs;
diff --git a/src/include/duckdb/common/file_system.hpp b/src/include/duckdb/common/file_system.hpp
index 0b83c4f393..58d70889cf 100644
--- a/src/include/duckdb/common/file_system.hpp
+++ b/src/include/duckdb/common/file_system.hpp
@@ -256,6 +256,30 @@ public:
DynamicCastCheck<TARGET>(this);
return reinterpret_cast<const TARGET &>(*this);
}
+public:
+ DUCKDB_API virtual void setTrusted(){
+ this->isTrustedContext = true;
+ }
+ DUCKDB_API virtual void setUntrusted(){
+ this->isTrustedContext = false;
+ }
+
+public:
+ class TrustedContext {
+ public:
+ TrustedContext(FileSystem &fs) : fileSystem(fs) {
+ fileSystem.setTrusted();
+ }
+ ~TrustedContext() {
+ fileSystem.setUntrusted();
+ }
+
+ private:
+ FileSystem &fileSystem;
+ };
+
+protected:
+ bool isTrustedContext = false;
};

} // namespace duckdb
diff --git a/src/include/duckdb/common/opener_file_system.hpp b/src/include/duckdb/common/opener_file_system.hpp
index 2d35512b21..d1597173ee 100644
--- a/src/include/duckdb/common/opener_file_system.hpp
+++ b/src/include/duckdb/common/opener_file_system.hpp
@@ -143,6 +143,18 @@ public:
vector<string> ListSubSystems() override {
return GetFileSystem().ListSubSystems();
}
+
+private:
+
+ virtual void setTrusted() override {
+ this->isTrustedContext = true;
+ GetFileSystem().setTrusted();
+ }
+
+ virtual void setUntrusted() override {
+ this->isTrustedContext = false;
+ GetFileSystem().setUntrusted();
+ }
};

} // namespace duckdb
diff --git a/src/main/extension/extension_install.cpp b/src/main/extension/extension_install.cpp
index 1258d95ead..fc9ce1c77d 100644
--- a/src/main/extension/extension_install.cpp
+++ b/src/main/extension/extension_install.cpp
@@ -57,6 +57,7 @@ const vector<string> ExtensionHelper::PathComponents() {
}

duckdb::string ExtensionHelper::DefaultExtensionFolder(FileSystem &fs) {
+ FileSystem::TrustedContext trusted(fs);
string home_directory = fs.GetHomeDirectory();
// exception if the home directory does not exist, don't create whatever we think is home
if (!fs.DirectoryExists(home_directory)) {
diff --git a/src/main/extension/extension_load.cpp b/src/main/extension/extension_load.cpp
index b0282a7103..c2765f97c6 100644
--- a/src/main/extension/extension_load.cpp
+++ b/src/main/extension/extension_load.cpp
@@ -293,6 +293,7 @@ bool ExtensionHelper::TryInitialLoad(DatabaseInstance &db, FileSystem &fs, const
if (!db.config.options.enable_external_access) {
throw PermissionException("Loading external extensions is disabled through configuration");
}
+ FileSystem::TrustedContext trusted(fs);
auto filename = fs.ConvertSeparators(extension);

bool direct_load;
Loading