Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Go: database local source models #17905

Open
wants to merge 24 commits into
base: main
Choose a base branch
from

Conversation

egregius313
Copy link
Contributor

@egregius313 egregius313 commented Nov 4, 2024

Pull Request checklist

All query authors

Internal query authors only

  • Autofixes generated based on these changes are valid, only needed if this PR makes significant changes to .ql, .qll, or .qhelp files. See the documentation (internal access required).
  • Changes are validated at scale (internal access required).
  • Adding a new query? Consider also adding the query to autofix.

@github-actions github-actions bot added the Go label Nov 4, 2024
@egregius313 egregius313 changed the title Egregius313/go/dataflow/models/database Go: database local source models Nov 4, 2024
Copy link
Contributor

github-actions bot commented Nov 4, 2024

⚠️ The head of this PR and the base branch were compared for differences in the framework coverage reports. The generated reports are available in the artifacts of this workflow run. The differences will be picked up by the nightly job after the PR gets merged.

Click to show differences in coverage

go

Generated file changes for go

  • Changes to framework-coverage-go.rst:
-    `Couchbase official client(gocb) <https://github.com/couchbase/gocb>`_,"``github.com/couchbase/gocb*``, ``gopkg.in/couchbase/gocb*``",,36,16
+    `Couchbase official client(gocb) <https://github.com/couchbase/gocb>`_,"``github.com/couchbase/gocb*``, ``gopkg.in/couchbase/gocb*``",26,56,16
-    `Couchbase unofficial client <http://www.github.com/couchbase/go-couchbase>`_,``github.com/couchbaselabs/gocb*``,,18,8
+    `Couchbase unofficial client <http://www.github.com/couchbase/go-couchbase>`_,``github.com/couchbaselabs/gocb*``,13,28,8
-    Others,"``github.com/Masterminds/squirrel``, ``github.com/caarlos0/env``, ``github.com/go-gorm/gorm``, ``github.com/go-xorm/xorm``, ``github.com/gobuffalo/envy``, ``github.com/gogf/gf/database/gdb``, ``github.com/hashicorp/go-envparse``, ``github.com/jinzhu/gorm``, ``github.com/jmoiron/sqlx``, ``github.com/joho/godotenv``, ``github.com/kelseyhightower/envconfig``, ``github.com/lann/squirrel``, ``github.com/raindog308/gorqlite``, ``github.com/rqlite/gorqlite``, ``github.com/uptrace/bun``, ``go.mongodb.org/mongo-driver/mongo``, ``gopkg.in/Masterminds/squirrel``, ``gorm.io/gorm``, ``xorm.io/xorm``",117,16,391
+    Others,"``github.com/Masterminds/squirrel``, ``github.com/caarlos0/env``, ``github.com/go-gorm/gorm``, ``github.com/go-xorm/xorm``, ``github.com/gobuffalo/envy``, ``github.com/gogf/gf/database/gdb``, ``github.com/hashicorp/go-envparse``, ``github.com/jinzhu/gorm``, ``github.com/jmoiron/sqlx``, ``github.com/joho/godotenv``, ``github.com/kelseyhightower/envconfig``, ``github.com/lann/squirrel``, ``github.com/raindog308/gorqlite``, ``github.com/rqlite/gorqlite``, ``github.com/uptrace/bun``, ``go.mongodb.org/mongo-driver/mongo``, ``gopkg.in/Masterminds/squirrel``, ``gorm.io/gorm``, ``xorm.io/xorm``",194,58,391
-    Totals,,459,943,1532
+    Totals,,575,1015,1532
  • Changes to framework-coverage-go.csv:
- github.com/Masterminds/squirrel,32,,,,,,,,,,,,,,32,,,,,,,,,,,,
+ github.com/Masterminds/squirrel,32,8,1,,,,,,,,,,,,32,,,,,,8,,,,,1,
- github.com/couchbase/gocb,8,,18,,,,,8,,,,,,,,,,,,,,,,,,18,
+ github.com/couchbase/gocb,8,13,28,,,,,8,,,,,,,,,,,,,13,,,,,28,
- github.com/couchbaselabs/gocb,8,,18,,,,,8,,,,,,,,,,,,,,,,,,18,
+ github.com/couchbaselabs/gocb,8,13,28,,,,,8,,,,,,,,,,,,,13,,,,,28,
- github.com/gogf/gf/database/gdb,51,,,,,,,,,,,,,,51,,,,,,,,,,,,
+ github.com/gogf/gf/database/gdb,51,12,30,,,,,,,,,,,,51,,,,,,12,,,,,30,
- github.com/lann/squirrel,32,,,,,,,,,,,,,,32,,,,,,,,,,,,
+ github.com/lann/squirrel,32,8,1,,,,,,,,,,,,32,,,,,,8,,,,,1,
- github.com/raindog308/gorqlite,24,,,,,,,,,,,,,,24,,,,,,,,,,,,
+ github.com/raindog308/gorqlite,24,8,2,,,,,,,,,,,,24,,,,,,8,,,,,2,
- github.com/rqlite/gorqlite,24,,,,,,,,,,,,,,24,,,,,,,,,,,,
+ github.com/rqlite/gorqlite,24,8,2,,,,,,,,,,,,24,,,,,,8,,,,,2,
- github.com/uptrace/bun,63,,,,,,,,,,,,,,63,,,,,,,,,,,,
+ github.com/uptrace/bun,63,18,,,,,,,,,,,,,63,,,,,,18,,,,,,
- go.mongodb.org/mongo-driver/mongo,14,,,,,,,14,,,,,,,,,,,,,,,,,,,
+ go.mongodb.org/mongo-driver/mongo,14,7,5,,,,,14,,,,,,,,,,,,,7,,,,,5,
- gopkg.in/Masterminds/squirrel,32,,,,,,,,,,,,,,32,,,,,,,,,,,,
+ gopkg.in/Masterminds/squirrel,32,8,1,,,,,,,,,,,,32,,,,,,8,,,,,1,
- gopkg.in/couchbase/gocb,8,,18,,,,,8,,,,,,,,,,,,,,,,,,18,
+ gopkg.in/couchbase/gocb,8,13,28,,,,,8,,,,,,,,,,,,,13,,,,,28,

@egregius313 egregius313 marked this pull request as ready for review November 6, 2024 16:24
@egregius313 egregius313 requested a review from a team as a code owner November 6, 2024 16:24
@owen-mc
Copy link
Contributor

owen-mc commented Nov 7, 2024

Two tests need to have their results updated. go/ql/test/query-tests/Security/CWE-079/StoredXss.qlref also needs to have a post-processing query added for pretty-printing models, as in this PR.

I haven't had time to look at the models themselves yet. Are these new models, or are you converting models from QL? If the latter, should we also delete the QL models?

@egregius313 egregius313 force-pushed the egregius313/go/dataflow/models/database branch from 95ebab7 to 006f91b Compare November 22, 2024 21:13
@egregius313
Copy link
Contributor Author

Are these new models, or are you converting models from QL? If the latter, should we also delete the QL models?

Yes these are primarily new models. There weren't many existing database models as far as I could tell.

The one package where I'm unsure about the existing models vs the new ones are the beego-orm models. The discrepancy seems to be from API changes in v2

@egregius313 egregius313 requested a review from owen-mc December 3, 2024 03:29
Copy link
Contributor

@owen-mc owen-mc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only reviewed half of the new models so far (stdlib, beego, couchbase).

I note that we do have some existing database read models - see the source definition for the stored xss query. We should convert them to MaD, deprecate the QL classes where they are only used for that, and use the database MaD sources for that query (ignoring what the active threat model is, I think).

Extremely minor stylistic comment: within one .model.yml file I prefer to put sources first, sinks second and summaries third. This is because there are generally the fewest sources, the second fewest sinks and the most summaries. It makes it slightly easier to read and see at a glance which extensible predicates are being extended. When summaries go first, it is easy to miss that other predicates are extended below.

go/ql/lib/ext/database.sql.model.yml Outdated Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Please test for all sources defined in each package.
  • That will probably make the file a bit longer - I would split this up into a separate source file for each package, so it is easier to find particular tests.

go/ql/lib/ext/github.com.beego.beego.client.orm.model.yml Outdated Show resolved Hide resolved
Comment on lines +62 to +64
- ["group:gocb2", "QueryResult", True, "One", "", "", "Argument[0]", "database", "manual"]
- ["group:gocb2", "QueryResult", True, "Row", "", "", "Argument[0]", "database", "manual"]
Copy link
Contributor

@owen-mc owen-mc Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make more sense to model Cluster.Query and Scope.Query (which are the only way of getting a QueryResult) as the sources and model the two methods above as summary models. There is a nice symmetry that the same function is an sql-injection sink for one argument and a database source for the return value.

Also, add summary models for QueryResult.Raw and QueryResultRaw.NextBytes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think that Cluster.AnalyticsQuery and Scope.AnalyticsQuery should be sources, and all the summary models for QueryResult should be duplicated for AnalyticsResult.

Comment on lines +64 to +66
- ["group:gocb2", "TransactionQueryResult", True, "One", "", "", "Argument[0]", "database", "manual"]
- ["group:gocb2", "TransactionQueryResult", True, "Row", "", "", "Argument[0]", "database", "manual"]
Copy link
Contributor

@owen-mc owen-mc Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, I think TransactionAttemptContext.Query should be the source and these should be summary models. (It should probably be an sql-injection sink as well - I've noted this down somewhere else, no need to do it in this PR.)

- ["group:gocb2", "Collection", True, "GetAndLock", "", "", "ReturnValue[0]", "database", "manual"]
- ["group:gocb2", "Collection", True, "GetAndTouch", "", "", "ReturnValue[0]", "database", "manual"]
- ["group:gocb2", "Collection", True, "GetAnyReplica", "", "", "ReturnValue[0]", "database", "manual"]
- ["group:gocb2", "Collection", True, "LookupIn", "", "", "ReturnValue[0]", "database", "manual"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably we should model LookupInAllReplicas and LookupInAnyReplica as sources, and a summary model for LookupInAllReplicasResult.Next.

- ["group:gocb2", "Collection", True, "GetAndTouch", "", "", "ReturnValue[0]", "database", "manual"]
- ["group:gocb2", "Collection", True, "GetAnyReplica", "", "", "ReturnValue[0]", "database", "manual"]
- ["group:gocb2", "Collection", True, "LookupIn", "", "", "ReturnValue[0]", "database", "manual"]
- ["group:gocb2", "Collection", True, "Scan", "", "", "ReturnValue[0]", "database", "manual"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need a summary model for ScanResultItem.Content

Copy link
Contributor

@owen-mc owen-mc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done another four libraries. jmoiron/sqlx was such hard going!

go/ql/lib/ext/github.com.gogf.gf.database.gdb.model.yml Outdated Show resolved Hide resolved
go/ql/lib/ext/github.com.gogf.gf.database.gdb.model.yml Outdated Show resolved Hide resolved
go/ql/lib/ext/github.com.gogf.gf.database.gdb.model.yml Outdated Show resolved Hide resolved
go/ql/lib/ext/github.com.gogf.gf.database.gdb.model.yml Outdated Show resolved Hide resolved
- ["github.com/gogf/gf/database/gdb", "Model", True, "FindOne", "", "", "Argument[receiver]", "ReturnValue[0]", "taint", "manual"]
- ["github.com/gogf/gf/database/gdb", "Model", True, "FindValue", "", "", "Argument[receiver]", "ReturnValue[0]", "taint", "manual"]
- ["github.com/gogf/gf/database/gdb", "Model", True, "FindScan", "", "", "Argument[receiver]", "Argument[0]", "taint", "manual"]
- ["github.com/gogf/gf/database/gdb", "Model", True, "One", "", "", "Argument[receiver]", "ReturnValue[0]", "taint", "manual"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also want models for methods Scan, ScanList, Select, Struct and Structs on Model.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, most of these methods on Model return a Model, which we aren't modelling at the moment. It would mean a huge number of models though. But without that, the examples they give, like db.Model("user").Where("id", 1).Scan(&user) , won't get the taint tracked through it properly. I'm not sure what to do here, to be honest.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change filename to github.com.rqlite.gorqlite.model.yml

extensible: summaryModel
data:
- ["group:gorqlite", "QueryResult", True, "Map", "", "", "Argument[receiver]", "ReturnValue[0]", "taint", "manual"]
- ["group:gorqlite", "QueryResult", True, "Scan", "", "", "Argument[receiver]", "Argument[0]", "taint", "manual"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, this model won't work because dataflow out of a variadic parameter doesn't work currently. It can be modeled in QL using TaintTracking::FunctionModel.

extensible: summaryModel
data:
- ["group:gorqlite", "QueryResult", True, "Map", "", "", "Argument[receiver]", "ReturnValue[0]", "taint", "manual"]
- ["group:gorqlite", "QueryResult", True, "Scan", "", "", "Argument[receiver]", "Argument[0]", "taint", "manual"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a model for QueryResult.Slice.

pack: codeql/go-all
extensible: summaryModel
data:
- ["group:squirrel", "RowScanner", True, "Scan", "", "", "Argument[receiver]", "Argument[0]", "taint", "manual"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, this model won't work because dataflow out of a variadic parameter doesn't work currently. It can be modeled in QL using TaintTracking::FunctionModel.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also needs models for QueryContextWith, QueryWith, QueryRowContextWith, QueryRowWith, Row.Scan (it seems to override the inherited Scan method from embedding RowScanner), *Builder.Query (different method signature to QueryRower.Query, unfortunately), *Builder.QueryContext, *Builder.QueryRow, *Builder.QueryRowContext, *Builder.Scan, *Builder.ScanContext. (Do we care about all the Builders? Delete? I guess safest to model them all...)

Copy link
Contributor

@owen-mc owen-mc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've finished going through all the libraries. Modeling is such hard work! Thanks for doing all this - it must have taken a long time.

Comment on lines +85 to +91
- ["github.com/uptrace/bun", "DeleteQuery", True, "Scan", "", "", "Argument[0]", "database", "manual"]
- ["github.com/uptrace/bun", "InsertQuery", True, "Scan", "", "", "Argument[0]", "database", "manual"]
- ["github.com/uptrace/bun", "MergeQuery", True, "Scan", "", "", "Argument[0]", "database", "manual"]
- ["github.com/uptrace/bun", "RawQuery", True, "Scan", "", "", "Argument[0]", "database", "manual"]
- ["github.com/uptrace/bun", "SelectQuery", True, "Scan", "", "", "Argument[0]", "database", "manual"]
- ["github.com/uptrace/bun", "TruncateQuery", True, "Scan", "", "", "Argument[0]", "database", "manual"]
- ["github.com/uptrace/bun", "UpdateQuery", True, "Scan", "", "", "Argument[0]", "database", "manual"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, this model won't work because dataflow out of a variadic parameter doesn't work currently. It can be modeled in QL.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last example here shows that we need to model Exec as well (in QL). And this part of the docs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These docs make me think we need source models for DB.Query* and summary models for DB.ScanRow and DB.ScanRows.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also model Collection.Aggregate and Collection.ChangeStream as sources.

go/ql/lib/ext/gorm.io.gorm.model.yml Outdated Show resolved Hide resolved
go/ql/lib/ext/database.sql.model.yml Outdated Show resolved Hide resolved
@egregius313 egregius313 force-pushed the egregius313/go/dataflow/models/database branch 3 times, most recently from 8a60eba to 16c84f7 Compare January 10, 2025 14:52
@egregius313 egregius313 force-pushed the egregius313/go/dataflow/models/database branch from 16c84f7 to bf8bed4 Compare January 10, 2025 15:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants