Open-source project name: graphql/dataloader
Open-source project URL: https://github.com/graphql/dataloader
Open-source language: JavaScript 99.0%
Open-source project introduction:

DataLoader

DataLoader is a generic utility to be used as part of your application's data fetching layer to provide a simplified and consistent API over various remote data sources such as databases or web services via batching and caching.

It is a port of the "Loader" API originally developed by @schrockn at Facebook in 2010 as a simplifying force to coalesce the sundry key-value store back-end APIs which existed at the time. At Facebook, "Loader" became one of the implementation details of the "Ent" framework, a privacy-aware data entity loading and caching layer within web server product code. This ultimately became the underpinning for Facebook's GraphQL server implementation and type definitions.

DataLoader is a simplified version of this original idea implemented in JavaScript for Node.js services. DataLoader is often used when implementing a graphql-js service, though it is also broadly useful in other situations.

This mechanism of batching and caching data requests is certainly not unique to Node.js or JavaScript; it is also the primary motivation for Haxl, Facebook's data loading library for Haskell. More about how Haxl works can be read in this blog post.

DataLoader is provided so that it may be useful not just to build GraphQL services for Node.js but also as a publicly available reference implementation of this concept, in the hope that it can be ported to other languages. If you port DataLoader to another language, please open an issue to include a link from this repository.

Getting Started

First, install DataLoader using npm.

npm install --save dataloader

To get started, create a DataLoader.

Batching

Batching is not an advanced feature; it's DataLoader's primary feature. Create loaders by providing a batch loading function.

const DataLoader = require('dataloader')
const userLoader = new DataLoader(keys => myBatchGetUsers(keys))

A batch loading function accepts an Array of keys, and returns a Promise which resolves to an Array of values.

Then load individual values from the loader. DataLoader will coalesce all individual loads which occur within a single frame of execution (a single tick of the event loop) and then call your batch function with all requested keys.

const user = await userLoader.load(1)
const invitedBy = await userLoader.load(user.invitedByID)
console.log(`User 1 was invited by ${invitedBy}`)
// Elsewhere in your application
const user = await userLoader.load(2)
const lastInvited = await userLoader.load(user.lastInvitedID)
console.log(`User 2 last invited ${lastInvited}`)

A naive application may have issued four round-trips to a backend for the required information, but with DataLoader this application will make at most two.

DataLoader allows you to decouple unrelated parts of your application without sacrificing the performance of batch data-loading. While the loader presents an API that loads individual values, all concurrent requests will be coalesced and presented to your batch loading function. This allows your application to safely distribute data fetching requirements throughout your application and maintain minimal outgoing data requests.

Batch Function

A batch loading function accepts an Array of keys, and returns a Promise which resolves to an Array of values or Error instances. The loader itself is provided as the this context.

async function batchFunction(keys) {
  const results = await db.fetchAllKeys(keys)
  return keys.map(key => results[key] || new Error(`No result for ${key}`))
}
const loader = new DataLoader(batchFunction)

There are a few constraints this function must uphold: the Array of values must be the same length as the Array of keys, and each index in the Array of values must correspond to the same index in the Array of keys.

For example, suppose your batch function was provided an Array of keys, and loading from a back-end service returned the values:

{ id: 9, name: 'Chicago' }
{ id: 1, name: 'New York' }
{ id: 2, name: 'San Francisco' }

Our back-end service returned results in a different order than we requested, likely because it was more efficient for it to do so. Also, it omitted a result for one of the requested keys.

To uphold the constraints of the batch function, it must return an Array of values the same length as the Array of keys, and re-order them to ensure each index aligns with the original keys:

[
  { id: 2, name: 'San Francisco' },
  { id: 9, name: 'Chicago' },
  null, // or perhaps `new Error()`
  { id: 1, name: 'New York' }
]

Batch Scheduling

By default DataLoader will coalesce all individual loads which occur within a
single frame of execution before calling your batch function with all requested
keys. This ensures no additional latency while capturing many related requests
into a single batch. In fact, this is the same behavior used in Facebook's
original PHP implementation in 2010.

However, sometimes this behavior is not desirable or optimal. Perhaps you expect requests to be spread out over a few subsequent ticks because of an existing use of setTimeout. A custom scheduler is provided as batchScheduleFn in options.

As an example, here is a batch scheduler which collects all requests over a 100ms window of time (and as a consequence, adds 100ms of latency):

const myLoader = new DataLoader(myBatchFn, {
  batchScheduleFn: callback => setTimeout(callback, 100)
})

As another example, here is a manually dispatched batch scheduler:

function createScheduler() {
  let callbacks = []
  return {
    schedule(callback) {
      callbacks.push(callback)
    },
    dispatch() {
      callbacks.forEach(callback => callback())
      callbacks = []
    }
  }
}
const { schedule, dispatch } = createScheduler()
const myLoader = new DataLoader(myBatchFn, { batchScheduleFn: schedule })
myLoader.load(1)
myLoader.load(2)
dispatch()

Caching

DataLoader provides a memoization cache for all loads which occur in a single request to your application. After load() is called once with a given key, the resulting value is cached to eliminate redundant loads.

Caching Per-Request

DataLoader caching does not replace Redis, Memcache, or any other shared
application-level cache. DataLoader is first and foremost a data loading mechanism,
and its cache only serves the purpose of not repeatedly loading the same data in
the context of a single request to your application. To do this, it maintains a
simple in-memory memoization cache (more accurately: load() is a memoized function).

Avoid sharing a DataLoader instance across requests from different users, which could result in cached data incorrectly appearing in each request. Typically, DataLoader instances are created when a request begins, and are not used once the request ends.

For example, when using with express:

function createLoaders(authToken) {
  return {
    users: new DataLoader(ids => genUsers(authToken, ids)),
  }
}

const app = express()
app.get('/', function(req, res) {
  const authToken = authenticateUser(req)
  const loaders = createLoaders(authToken)
  res.send(renderPage(req, loaders))
})
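app.listen()

Caching and Batching

Subsequent calls to load() with the same key will result in that key not appearing in the keys provided to your batch function. However, the resulting Promise will still wait on the current batch to complete, so cached and uncached requests resolve at the same time.

In the example below, User 1 happens to be cached. However, because User 1 and 2 are loaded in the same tick, they will resolve at the same time, and both user.bestFriendID loads will also happen in the same tick.

userLoader.prime(1, { bestFriendID: 3 })

async function getBestFriend(userID) {
  const user = await userLoader.load(userID)
  return await userLoader.load(user.bestFriendID)
}
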
// In one part of your application
getBestFriend(1)
// Elsewhere
getBestFriend(2)

Without this optimization, if the cached User 1 resolved immediately, the two user.bestFriendID loads would happen at different times and could not share a batch.

Clearing Cache

In certain uncommon cases, clearing the request cache may be necessary.

The most common case where clearing the loader's cache is necessary is after a mutation or update within the same request, when a cached value could be out of date and future loads should not use any possibly cached value.

Here's a simple example using SQL UPDATE to illustrate.

// Request begins...
const userLoader = new DataLoader(...)
// And a value happens to be loaded (and cached).
const user = await userLoader.load(4)
// A mutation occurs, invalidating what might be in cache.
await sqlRun('UPDATE users SET username="zuck" WHERE id=4')
userLoader.clear(4)
// Later the value is loaded again so the mutated data appears.
const updatedUser = await userLoader.load(4)
// Request completes.

Caching Errors

If a batch load fails (that is, a batch function throws or returns a rejected Promise), then the requested values will not be cached. However, if a batch function returns an Error instance for an individual value, that Error will be cached to avoid frequently loading the same Error.

In some circumstances you may wish to clear the cache for these individual Errors:

try {
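  const user = await userLoader.load(1)
} catch (error) {
  // shouldNotCache is a hypothetical predicate you supply to decide
  // whether the error should not be cached.
  if (shouldNotCache(error)) {
    userLoader.clear(1)
  }
  throw error
}

Disabling Cache

In certain uncommon cases, a DataLoader which does not cache may be desirable. Calling new DataLoader(myBatchFn, { cache: false }) will ensure that every call to load() produces a new Promise, and requested keys will not be saved in memory.

However, when the memoization cache is disabled, your batch function will receive an array of keys which may contain duplicates! Each key will be associated with each call to load(). Your batch loader should provide a value for each instance of the requested key.

For example:

const myLoader = new DataLoader(keys => {
  console.log(keys)
  return someBatchLoadFn(keys)
}, { cache: false })
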
myLoader.load('A')
myLoader.load('B')
myLoader.load('A')
// > [ 'A', 'B', 'A' ]

More complex cache behavior can be achieved by calling clear() or clearAll() rather than disabling the cache completely. For example, this DataLoader clears its cache each time the batch function is called:

const myLoader = new DataLoader(keys => {
  myLoader.clearAll()
  return someBatchLoadFn(keys)
})

Custom Cache

As mentioned above, DataLoader is intended to be used as a per-request cache. Since requests are short-lived, DataLoader uses an infinitely growing Map as a memoization cache. This should not pose a problem as most requests are short-lived and the entire cache can be discarded after the request completes.

However, this memoization caching strategy isn't safe when using a long-lived DataLoader, since it could consume too much memory. If using DataLoader in this way, you can provide a custom Cache instance with whatever behavior you prefer, as long as it follows the same API as Map.

The example below uses an LRU (least recently used) cache, via the lru_map npm package, to hold at most 100 cached values.

import { LRUMap } from 'lru_map'
const myLoader = new DataLoader(someBatchLoadFn, {
  cacheMap: new LRUMap(100)
})

More specifically, any object that implements the methods get(), set(), delete() and clear() can be provided.

API

class DataLoader

DataLoader creates a public API for loading data from a particular data back-end with unique keys such as the id column of a SQL table or document name in a MongoDB database, given a batch loading function. Each DataLoader instance contains a unique memoized cache.

The constructor accepts the following options:

Option Key | Type | Default | Description |
---|---|---|---|
batch | Boolean | true | Set to false to disable batching, invoking batchLoadFn with a single load key. This is equivalent to setting maxBatchSize to 1. |
maxBatchSize | Number | Infinity | Limits the number of items that get passed in to the batchLoadFn. May be set to 1 to disable batching. |
batchScheduleFn | Function | See Batch Scheduling | A function to schedule the later execution of a batch. The function is expected to call the provided callback in the immediate future. |
cache | Boolean | true | Set to false to disable memoization caching, creating a new Promise and new key in the batchLoadFn for every load of the same key. This is equivalent to setting cacheMap to null. |
cacheKeyFn | Function | key => key | Produces cache key for a given load key. Useful when objects are keys and two objects should be considered equivalent. |
cacheMap | Object | new Map() | Instance of Map (or an object with a similar API) to be used as cache. May be set to null to disable caching. |
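
As an illustration of the cacheKeyFn option, the sketch below derives a stable string from object keys so that two structurally equal objects share one cache entry. The helper name stableKey and the sample key shape are assumptions made for this example and are not part of DataLoader.

```javascript
// Hypothetical helper: build a cache key that ignores property order.
// Assumes keys are flat objects with JSON-serializable values.
function stableKey(key) {
  const sorted = Object.keys(key).sort().map(name => [name, key[name]])
  return JSON.stringify(sorted)
}

// Two distinct objects that should be treated as the same load key.
const a = { type: 'user', id: 7 }
const b = { id: 7, type: 'user' }

// Keyed by object identity these would be two entries; keyed by stableKey they are one.
const cache = new Map()
cache.set(stableKey(a), 'cached value')
console.log(cache.has(stableKey(b))) // true

// Passing it to a loader would look like this (requires the dataloader package):
// const loader = new DataLoader(batchFn, { cacheKeyFn: stableKey })
```

A string key keeps the default Map usable as the cacheMap, since Map compares strings by value but objects by identity.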
load(key)
Loads a key, returning a Promise for the value represented by that key.
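
To make the batching behind load() concrete, here is a deliberately simplified sketch of a loader that queues keys and flushes them once per tick. It is not DataLoader's actual implementation: TinyLoader and fakeBatchFn are names invented for this example, and caching and most error handling are omitted.

```javascript
// Toy illustration only: queue keys and dispatch them in one batch per tick.
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn
    this.queue = [] // pending { key, resolve, reject } entries
  }

  load(key) {
    return new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject })
      // Schedule a single flush when the first key of a batch arrives.
      if (this.queue.length === 1) {
        process.nextTick(() => this.flush())
      }
    })
  }

  flush() {
    const batch = this.queue
    this.queue = []
    // The batch function must return values in the same order as the keys.
    this.batchFn(batch.map(item => item.key)).then(
      values => batch.forEach((item, i) => item.resolve(values[i])),
      error => batch.forEach(item => item.reject(error))
    )
  }
}

// Hypothetical batch function that records each call it receives.
const batchCalls = []
const fakeBatchFn = async keys => {
  batchCalls.push(keys)
  return keys.map(key => `value-${key}`)
}

const tiny = new TinyLoader(fakeBatchFn)
const pending = Promise.all([tiny.load(1), tiny.load(2), tiny.load(3)])
pending.then(values => {
  console.log(values)            // [ 'value-1', 'value-2', 'value-3' ]
  console.log(batchCalls.length) // 1
})
```

The three load() calls run in the same tick, so the batch function is invoked once with all three keys.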
loadMany(keys)
Loads multiple keys, promising an array of values:
const [ a, b ] = await myLoader.loadMany([ 'a', 'b' ])
This is similar to the more verbose:
const [ a, b ] = await Promise.all([
  myLoader.load('a'),
  myLoader.load('b')
])
However, it differs when any load fails: where Promise.all() would reject, loadMany() always resolves, and each result is either a value or an Error instance.
const [ a, b, c ] = await myLoader.loadMany([ 'a', 'b', 'badkey' ])
// c instanceof Error
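
The difference from Promise.all() can be sketched without DataLoader itself. In the sketch below, loadOne is a hypothetical stand-in for a loader's load() method, and settleAll converts each rejection into an Error value so the combined promise always resolves, mirroring the loadMany() behavior described above.

```javascript
// Hypothetical stand-in for loader.load(): rejects for one known-bad key.
const loadOne = key =>
  key === 'badkey'
    ? Promise.reject(new Error(`No result for ${key}`))
    : Promise.resolve(key.toUpperCase())

// Resolve every load to either its value or the Error it rejected with.
const settleAll = keys =>
  Promise.all(keys.map(key => loadOne(key).catch(error => error)))

const settled = settleAll(['a', 'b', 'badkey'])
settled.then(([a, b, c]) => {
  console.log(a, b)               // A B
  console.log(c instanceof Error) // true
})
```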
clear(key)
Clears the value at key from the cache, if it exists. Returns itself for method chaining.
clearAll()
Clears the entire cache. To be used when some event results in unknown invalidations across this particular DataLoader. Returns itself for method chaining.
prime(key, value)
Primes the cache with the provided key and value. If the key already exists, no change is made. (To forcefully prime the cache, clear the key first with loader.clear(key).prime(key, value).) Returns itself for method chaining.
To prime the cache with an error at a key, provide an Error instance.
DataLoader pairs nicely with GraphQL. GraphQL fields are designed to be stand-alone functions. Without a caching or batching mechanism, it's easy for a naive GraphQL server to issue new database requests each time a field is resolved.
Consider the following GraphQL request:
{
  me {
    name
    bestFriend {
      name
    }
    friends(first: 5) {
      name
      bestFriend {
        name
      }
    }
  }
}
Naively, if me, bestFriend and friends each need to request the backend, there could be at most 13 database requests!
When using DataLoader, we could define the User type using the SQLite example with clearer code and at most 4 database requests, and possibly fewer if there are cache hits.
const UserType = new GraphQLObjectType({
  name: 'User',
  fields: () => ({
    name: { type: GraphQLString },
    bestFriend: {
      type: UserType,
      resolve: user => userLoader.load(user.bestFriendID)
    },
    friends: {
      args: {
        first: { type: GraphQLInt }
      },
      type: new GraphQLList(UserType),
      resolve: async (user, { first }) => {
        const rows = await queryLoader.load([
          'SELECT toID FROM friends WHERE fromID=? LIMIT ?', user.id, first
        ])
        return rows.map(row => userLoader.load(row.toID))
      }
    }
  })
})
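
The userLoader used by these resolvers needs a batch function that returns values in the same order as the keys it was given. A minimal sketch of that re-ordering step follows. The helper alignToKeys, the key list, and the sample rows are assumptions made for illustration; the rows stand in for whatever your back-end returns.

```javascript
// Hypothetical helper: align back-end rows with the requested keys.
// Missing rows become Error instances so every index has an entry.
function alignToKeys(keys, rows, getKey = row => row.id) {
  const byKey = new Map(rows.map(row => [getKey(row), row]))
  return keys.map(key =>
    byKey.has(key) ? byKey.get(key) : new Error(`No result for ${key}`)
  )
}

// Rows returned out of order, with no row for key 6.
const rows = [
  { id: 9, name: 'Chicago' },
  { id: 1, name: 'New York' },
  { id: 2, name: 'San Francisco' }
]
const aligned = alignToKeys([2, 9, 6, 1], rows)
console.log(aligned.map(v => (v instanceof Error ? v.message : v.name)))
// [ 'San Francisco', 'Chicago', 'No result for 6', 'New York' ]
```

Indexing the rows in a Map first keeps the alignment linear in the number of keys, and returning an Error for a missing row preserves the required array length.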
In many applications, a web server using DataLoader serves requests to many different users with different access permissions. It may be dangerous to use one cache across many users, so creating a new DataLoader per request is encouraged:
function createLoaders(authToken) {
return {
users: new DataLoader(ids => genUsers