File 557eb620-gnttab-make-the-grant-table-lock-a-read-write-lock.patch of Package xen.7653
# Commit 40de9fffb4cc0b0485aa3391d72e2220b8e1ce12
# Date 2015-06-15 13:25:20 +0200
# Author David Vrabel <david.vrabel@citrix.com>
# Committer Jan Beulich <jbeulich@suse.com>
gnttab: make the grant table lock a read-write lock

In combination with the per-active entry locks, the grant table lock
can be made a read-write lock since the majority of cases only the
read lock is required. The grant table read lock protects against
changes to the table version or size (which are done with the write
lock held).

The write lock is also required when two active entries must be
acquired.

The double lock is still required when updating IOMMU page tables.

With the lock contention being only on the maptrack lock (unless IOMMU
updates are required), performance and scalability is improved.

Based on a patch originally by Matt Wilson <msw@amazon.com>.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

--- a/docs/misc/grant-tables.txt
+++ b/docs/misc/grant-tables.txt
@@ -83,7 +83,7 @@ is complete.
 ~~~~~~~
 Xen uses several locks to serialize access to the internal grant table state.
 
-  grant_table->lock          : lock used to prevent readers from accessing
+  grant_table->lock          : rwlock used to prevent readers from accessing
                                inconsistent grant table state such as current
                                version, partially initialized active table
                                pages, etc.
@@ -91,34 +91,43 @@ is complete.
   active_grant_entry->lock   : spinlock used to serialize modifications to
                                active entries
 
- The primary lock for the grant table is a spinlock. All functions
- that access members of struct grant_table must acquire the lock
- around critical sections.
+ The primary lock for the grant table is a read/write spinlock. All
+ functions that access members of struct grant_table must acquire a
+ read lock around critical sections. Any modification to the members
+ of struct grant_table (e.g., nr_status_frames, nr_grant_frames,
+ active frames, etc.) must only be made if the write lock is
+ held. These elements are read-mostly, and read critical sections can
+ be large, which makes a rwlock a good choice.
 
 The maptrack free list is protected by its own spinlock. The maptrack
 lock may be locked while holding the grant table lock.
 
 Active entries are obtained by calling active_entry_acquire(gt, ref).
 This function returns a pointer to the active entry after locking its
-spinlock. The caller must hold the grant table lock for the gt in
-question before calling active_entry_acquire(). This is because the
-grant table can be dynamically extended via gnttab_grow_table() while
-a domain is running and must be fully initialized. Once all access to
-the active entry is complete, release the lock by calling
-active_entry_release(act).
+spinlock. The caller must hold the grant table read lock before
+calling active_entry_acquire(). This is because the grant table can
+be dynamically extended via gnttab_grow_table() while a domain is
+running and must be fully initialized. Once all access to the active
+entry is complete, release the lock by calling active_entry_release(act).
 
 Summary of rules for locking:
  active_entry_acquire() and active_entry_release() can only be
- called when holding the relevant grant table's lock. I.e.:
-   spin_lock(&gt->lock);
+ called when holding the relevant grant table's read lock. I.e.:
+   read_lock(&gt->lock);
    act = active_entry_acquire(gt, ref);
    ...
    active_entry_release(act);
-   spin_unlock(&gt->lock);
+   read_unlock(&gt->lock);
 
 Active entries cannot be acquired while holding the maptrack lock.
 Multiple active entries can be acquired while holding the grant table
-lock.
+_write_ lock.
+
+Maptrack entries are protected by the corresponding active entry
+lock. As an exception, new maptrack entries may be populated without
+holding the lock, provided the flags field is written last. This
+requires any maptrack entry user validates the flags field as
+non-zero first.
 
********************************************************************************
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1041,7 +1041,7 @@ int xenmem_add_to_physmap_one(
     switch ( space )
     {
     case XENMAPSPACE_grant_table:
-        spin_lock(&d->grant_table->lock);
+        write_lock(&d->grant_table->lock);
 
         if ( d->grant_table->gt_version == 0 )
             d->grant_table->gt_version = 1;
@@ -1076,7 +1076,7 @@ int xenmem_add_to_physmap_one(
                                 mfn, 0);
             if ( rc )
             {
-                spin_unlock(&d->grant_table->lock);
+                write_unlock(&d->grant_table->lock);
                 return rc;
             }
         }
@@ -1088,7 +1088,7 @@ int xenmem_add_to_physmap_one(
             t = p2m_ram_rw;
         }
 
-        spin_unlock(&d->grant_table->lock);
+        write_unlock(&d->grant_table->lock);
 
         if ( mfn == INVALID_MFN )
             return -EINVAL;
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4920,7 +4920,7 @@ int xenmem_add_to_physmap_one(
         mfn = virt_to_mfn(d->shared_info);
         break;
     case XENMAPSPACE_grant_table:
-        spin_lock(&d->grant_table->lock);
+        write_lock(&d->grant_table->lock);
 
         if ( d->grant_table->gt_version == 0 )
             d->grant_table->gt_version = 1;
@@ -4942,7 +4942,7 @@ int xenmem_add_to_physmap_one(
             mfn = virt_to_mfn(d->grant_table->shared_raw[idx]);
         }
 
-        spin_unlock(&d->grant_table->lock);
+        write_unlock(&d->grant_table->lock);
         break;
     case XENMAPSPACE_gmfn_range:
     case XENMAPSPACE_gmfn:
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -196,7 +196,7 @@ active_entry_acquire(struct grant_table
 {
     struct active_grant_entry *act;
 
-    ASSERT(spin_is_locked(&t->lock));
+    ASSERT(rw_is_locked(&t->lock));
 
     act = &_active_entry(t, e);
     spin_lock(&act->lock);
@@ -252,25 +252,29 @@ static int __get_paged_frame(unsigned lo
 static inline void
 double_gt_lock(struct grant_table *lgt, struct grant_table *rgt)
 {
+    /*
+     * See mapcount() for why the write lock is also required for the
+     * remote domain.
+     */
     if ( lgt < rgt )
     {
-        spin_lock(&lgt->lock);
-        spin_lock(&rgt->lock);
+        write_lock(&lgt->lock);
+        write_lock(&rgt->lock);
     }
     else
     {
         if ( lgt != rgt )
-            spin_lock(&rgt->lock);
-        spin_lock(&lgt->lock);
+            write_lock(&rgt->lock);
+        write_lock(&lgt->lock);
     }
 }
 
 static inline void
 double_gt_unlock(struct grant_table *lgt, struct grant_table *rgt)
 {
-    spin_unlock(&lgt->lock);
+    write_unlock(&lgt->lock);
     if ( lgt != rgt )
-        spin_unlock(&rgt->lock);
+        write_unlock(&rgt->lock);
 }
 
 static inline int
@@ -528,7 +532,7 @@ static int grant_map_exists(const struct
 {
     unsigned int ref, max_iter;
 
-    ASSERT(spin_is_locked(&rgt->lock));
+    ASSERT(rw_is_locked(&rgt->lock));
 
     max_iter = min(*ref_count + (1 << GNTTABOP_CONTINUATION_ARG_SHIFT),
                    nr_grant_entries(rgt));
@@ -568,15 +572,15 @@ static void mapcount(
     *wrc = *rdc = 0;
 
     /*
-     * Must have the local domain's grant table lock when iterating
-     * over its maptrack entries.
+     * Must have the local domain's grant table write lock when
+     * iterating over its maptrack entries.
      */
-    ASSERT(spin_is_locked(&lgt->lock));
+    ASSERT(rw_is_write_locked(&lgt->lock));
     /*
-     * Must have the remote domain's grant table lock while counting
-     * its active entries.
+     * Must have the remote domain's grant table write lock while
+     * counting its active entries.
      */
-    ASSERT(spin_is_locked(&rd->grant_table->lock));
+    ASSERT(rw_is_write_locked(&rd->grant_table->lock));
 
     for ( handle = 0; handle < lgt->maptrack_limit; handle++ )
     {
@@ -617,6 +621,7 @@ __gnttab_map_grant_ref(
     grant_entry_v2_t *sha2;
     grant_entry_header_t *shah;
     uint16_t *status;
+    bool_t need_iommu;
 
     led = current;
     ld = led->domain;
@@ -662,7 +667,7 @@ __gnttab_map_grant_ref(
     }
 
     rgt = rd->grant_table;
-    spin_lock(&rgt->lock);
+    read_lock(&rgt->lock);
 
     if ( rgt->gt_version == 0 )
         PIN_FAIL(unlock_out, GNTST_general_error,
@@ -736,7 +741,7 @@ __gnttab_map_grant_ref(
         cache_flags = (shah->flags & (GTF_PAT | GTF_PWT | GTF_PCD) );
 
     active_entry_release(act);
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
 
     /* pg may be set, with a refcount included, from __get_paged_frame */
     if ( !pg )
@@ -831,12 +836,14 @@ __gnttab_map_grant_ref(
         goto undo_out;
     }
 
-    double_gt_lock(lgt, rgt);
-
-    if ( gnttab_need_iommu_mapping(ld) )
+    need_iommu = gnttab_need_iommu_mapping(ld);
+    if ( need_iommu )
     {
         unsigned int wrc, rdc;
         int err = 0;
+
+        double_gt_lock(lgt, rgt);
+
         /* We're not translated, so we know that gmfns and mfns are
            the same things, so the IOMMU entry is always 1-to-1. */
         mapcount(lgt, rd, frame, &wrc, &rdc);
@@ -862,12 +869,22 @@ __gnttab_map_grant_ref(
 
     TRACE_1D(TRC_MEM_PAGE_GRANT_MAP, op->dom);
 
+    /*
+     * All maptrack entry users check mt->flags first before using the
+     * other fields so just ensure the flags field is stored last.
+     *
+     * However, if gnttab_need_iommu_mapping() then this would race
+     * with a concurrent mapcount() call (on an unmap, for example)
+     * and a lock is required.
+     */
     mt = &maptrack_entry(lgt, handle);
     mt->domid = op->dom;
     mt->ref   = op->ref;
-    mt->flags = op->flags;
+    wmb();
+    write_atomic(&mt->flags, op->flags);
 
-    double_gt_unlock(lgt, rgt);
+    if ( need_iommu )
+        double_gt_unlock(lgt, rgt);
 
     op->dev_bus_addr = (u64)frame << PAGE_SHIFT;
     op->handle       = handle;
@@ -883,7 +900,7 @@ __gnttab_map_grant_ref(
     while ( refcnt-- )
         put_page(pg);
 
-    spin_lock(&rgt->lock);
+    read_lock(&rgt->lock);
 
     act = active_entry_acquire(rgt, op->ref);
 
@@ -906,7 +923,7 @@ __gnttab_map_grant_ref(
     active_entry_release(act);
 
  unlock_out:
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
 
     op->status = rc;
     put_maptrack_handle(lgt, handle);
     rcu_unlock_domain(rd);
@@ -959,18 +976,18 @@ __gnttab_unmap_common(
     smp_rmb();
     map = &maptrack_entry(lgt, op->handle);
 
-    spin_lock(&lgt->lock);
+    read_lock(&lgt->lock);
 
-    if ( unlikely(!map->flags) )
+    if ( unlikely(!read_atomic(&map->flags)) )
     {
-        spin_unlock(&lgt->lock);
+        read_unlock(&lgt->lock);
         gdprintk(XENLOG_INFO, "Zero flags for handle (%d).\n", op->handle);
         op->status = GNTST_bad_handle;
         return;
     }
 
     dom = map->domid;
-    spin_unlock(&lgt->lock);
+    read_unlock(&lgt->lock);
 
     if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) )
     {
@@ -991,7 +1008,7 @@ __gnttab_unmap_common(
     TRACE_1D(TRC_MEM_PAGE_GRANT_UNMAP, dom);
 
     rgt = rd->grant_table;
-    double_gt_lock(lgt, rgt);
+    read_lock(&rgt->lock);
 
     if ( rgt->gt_version == 0 )
     {
@@ -1009,7 +1026,24 @@ __gnttab_unmap_common(
 
     op->rd = rd;
     op->ref = map->ref;
-    act = active_entry_acquire(rgt, map->ref);
+
+    /*
+     * We can't assume there was no racing unmap for this maptrack entry,
+     * and hence we can't assume map->ref is valid for rd. While the checks
+     * below (with the active entry lock held) will reject any such racing
+     * requests, we still need to make sure we don't attempt to acquire an
+     * invalid lock.
+     */
+    smp_rmb();
+    if ( unlikely(op->ref >= nr_grant_entries(rgt)) )
+    {
+        gdprintk(XENLOG_WARNING, "Unstable handle %#x\n", op->handle);
+        rc = GNTST_bad_handle;
+        flags = 0;
+        goto unmap_out;
+    }
+
+    act = active_entry_acquire(rgt, op->ref);
 
     /*
      * Note that we (ab)use the active entry lock here to protect against
@@ -1019,8 +1053,9 @@ __gnttab_unmap_common(
      * hold anyway; see docs/misc/grant-tables.txt's "Locking" section.
      */
 
-    flags = map->flags;
-    if ( unlikely(!flags) || unlikely(map->domid != dom) )
+    flags = read_atomic(&map->flags);
+    if ( unlikely(!flags) || unlikely(map->domid != dom) ||
+         unlikely(map->ref != op->ref) )
     {
         gdprintk(XENLOG_WARNING, "Unstable handle %u\n", op->handle);
         rc = GNTST_bad_handle;
@@ -1061,7 +1096,7 @@ __gnttab_unmap_common(
  act_release_out:
     active_entry_release(act);
  unmap_out:
-    double_gt_unlock(lgt, rgt);
+    read_unlock(&rgt->lock);
 
     if ( put_handle )
         put_maptrack_handle(lgt, op->handle);
@@ -1113,7 +1148,7 @@ __gnttab_unmap_common_complete(struct gn
     rcu_lock_domain(rd);
     rgt = rd->grant_table;
 
-    spin_lock(&rgt->lock);
+    read_lock(&rgt->lock);
 
     act = active_entry_acquire(rgt, op->ref);
     sha = shared_entry_header(rgt, op->ref);
@@ -1167,7 +1202,7 @@ __gnttab_unmap_common_complete(struct gn
         gnttab_clear_flag(_GTF_reading, status);
 
     active_entry_release(act);
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
 
     rcu_unlock_domain(rd);
 }
@@ -1402,11 +1437,13 @@ gnttab_unpopulate_status_frames(struct d
     return 0;
 }
 
+/*
+ * Grow the grant table. The caller must hold the grant table's
+ * write lock before calling this function.
+ */
 int
 gnttab_grow_table(struct domain *d, unsigned int req_nr_frames)
 {
-    /* d's grant table lock must be held by the caller */
-
     struct grant_table *gt = d->grant_table;
     unsigned int i, j;
@@ -1512,7 +1549,7 @@ gnttab_setup_table(
     }
 
     gt = d->grant_table;
-    spin_lock(&gt->lock);
+    write_lock(&gt->lock);
 
     if ( gt->gt_version == 0 )
         gt->gt_version = 1;
@@ -1540,7 +1577,7 @@ gnttab_setup_table(
     }
 
  out3:
-    spin_unlock(&gt->lock);
+    write_unlock(&gt->lock);
  out2:
     rcu_unlock_domain(d);
  out1:
@@ -1582,13 +1619,13 @@ gnttab_query_size(
         goto query_out_unlock;
     }
 
-    spin_lock(&d->grant_table->lock);
+    read_lock(&d->grant_table->lock);
 
     op.nr_frames     = nr_grant_frames(d->grant_table);
     op.max_nr_frames = max_grant_frames;
     op.status        = GNTST_okay;
 
-    spin_unlock(&d->grant_table->lock);
+    read_unlock(&d->grant_table->lock);
 
  query_out_unlock:
@@ -1614,7 +1651,7 @@ gnttab_prepare_for_transfer(
     union grant_combo   scombo, prev_scombo, new_scombo;
     int                 retries = 0;
 
-    spin_lock(&rgt->lock);
+    read_lock(&rgt->lock);
 
     if ( rgt->gt_version == 0 )
     {
@@ -1665,11 +1702,11 @@ gnttab_prepare_for_transfer(
         scombo = prev_scombo;
     }
 
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
     return 1;
 
  fail:
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
     return 0;
 }
 
@@ -1684,6 +1721,7 @@ gnttab_transfer(
     struct gnttab_transfer gop;
     unsigned long mfn;
     unsigned int max_bitsize;
+    struct active_grant_entry *act;
 
     for ( i = 0; i < count; i++ )
     {
@@ -1859,7 +1897,8 @@ gnttab_transfer(
         TRACE_1D(TRC_MEM_PAGE_GRANT_TRANSFER, e->domain_id);
 
         /* Tell the guest about its new page frame. */
-        spin_lock(&e->grant_table->lock);
+        read_lock(&e->grant_table->lock);
+        act = active_entry_acquire(e->grant_table, gop.ref);
 
         if ( e->grant_table->gt_version == 1 )
         {
@@ -1877,7 +1916,8 @@ gnttab_transfer(
             shared_entry_header(e->grant_table, gop.ref)->flags |=
                 GTF_transfer_completed;
 
-        spin_unlock(&e->grant_table->lock);
+        active_entry_release(act);
+        read_unlock(&e->grant_table->lock);
 
         rcu_unlock_domain(e);
@@ -1910,7 +1950,7 @@ __release_grant_for_copy(
     grant_ref_t trans_gref;
     struct domain *td;
 
-    spin_lock(&rgt->lock);
+    read_lock(&rgt->lock);
 
     act = active_entry_acquire(rgt, gref);
     sha = shared_entry_header(rgt, gref);
@@ -1946,7 +1986,7 @@ __release_grant_for_copy(
         gnttab_clear_flag(_GTF_reading, status);
 
     active_entry_release(act);
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
 
     if ( td != rd )
     {
@@ -2003,7 +2043,7 @@ __acquire_grant_for_copy(
 
     *page = NULL;
 
-    spin_lock(&rgt->lock);
+    read_lock(&rgt->lock);
 
     if ( rgt->gt_version == 0 )
         PIN_FAIL(gt_unlock_out, GNTST_general_error,
@@ -2076,13 +2116,13 @@ __acquire_grant_for_copy(
              * here and reacquire.
              */
             active_entry_release(act);
-            spin_unlock(&rgt->lock);
+            read_unlock(&rgt->lock);
 
             rc = __acquire_grant_for_copy(td, trans_gref, rd->domain_id,
                                           readonly, &grant_frame, page,
                                           &trans_page_off, &trans_length, 0);
 
-            spin_lock(&rgt->lock);
+            read_lock(&rgt->lock);
             act = active_entry_acquire(rgt, gref);
 
             if ( rc != GNTST_okay )
@@ -2090,7 +2130,7 @@ __acquire_grant_for_copy(
                 __fixup_status_for_copy_pin(act, status);
                 rcu_unlock_domain(td);
                 active_entry_release(act);
-                spin_unlock(&rgt->lock);
+                read_unlock(&rgt->lock);
                 return rc;
             }
@@ -2112,7 +2152,7 @@ __acquire_grant_for_copy(
                 __fixup_status_for_copy_pin(act, status);
                 rcu_unlock_domain(td);
                 active_entry_release(act);
-                spin_unlock(&rgt->lock);
+                read_unlock(&rgt->lock);
                 put_page(*page);
                 *page = NULL;
                 return ERESTART;
@@ -2217,7 +2257,7 @@ __acquire_grant_for_copy(
     *frame = act->frame;
 
     active_entry_release(act);
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
     return rc;
 
  unlock_out_clear:
@@ -2232,7 +2272,7 @@ __acquire_grant_for_copy(
     active_entry_release(act);
 
  gt_unlock_out:
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
 
     return rc;
 }
@@ -2563,7 +2603,7 @@ gnttab_set_version(XEN_GUEST_HANDLE_PARA
     if ( gt->gt_version == op.version )
         goto out;
 
-    spin_lock(&gt->lock);
+    write_lock(&gt->lock);
     /* Make sure that the grant table isn't currently in use when we
        change the version number, except for the first 8 entries which
        are allowed to be in use (xenstore/xenconsole keeps them mapped).
@@ -2670,7 +2710,7 @@ gnttab_set_version(XEN_GUEST_HANDLE_PARA
     gt->gt_version = op.version;
 
  out_unlock:
-    spin_unlock(&gt->lock);
+    write_unlock(&gt->lock);
 
  out:
     op.version = gt->gt_version;
@@ -2726,7 +2766,7 @@ gnttab_get_status_frames(XEN_GUEST_HANDL
 
     op.status = GNTST_okay;
 
-    spin_lock(&gt->lock);
+    read_lock(&gt->lock);
 
     for ( i = 0; i < op.nr_frames; i++ )
     {
@@ -2735,7 +2775,7 @@ gnttab_get_status_frames(XEN_GUEST_HANDL
             op.status = GNTST_bad_virt_addr;
     }
 
-    spin_unlock(&gt->lock);
+    read_unlock(&gt->lock);
 out2:
     rcu_unlock_domain(d);
 out1:
@@ -2785,7 +2825,7 @@ __gnttab_swap_grant_ref(grant_ref_t ref_
     struct active_grant_entry *act_b = NULL;
     s16 rc = GNTST_okay;
 
-    spin_lock(&gt->lock);
+    write_lock(&gt->lock);
 
     if ( gt->gt_version == 0 )
         PIN_FAIL(out, GNTST_general_error, "grant table not yet set up\n");
@@ -2836,7 +2876,7 @@ out:
         active_entry_release(act_b);
     if ( act_a != NULL )
         active_entry_release(act_a);
-    spin_unlock(&gt->lock);
+    write_unlock(&gt->lock);
 
     rcu_unlock_domain(d);
@@ -2907,12 +2947,12 @@ static int __gnttab_cache_flush(gnttab_c
 
     if ( d != owner )
     {
-        spin_lock(&owner->grant_table->lock);
+        read_lock(&owner->grant_table->lock);
 
         ret = grant_map_exists(d, owner->grant_table, mfn, ref_count);
         if ( ret != 0 )
         {
-            spin_unlock(&owner->grant_table->lock);
+            read_unlock(&owner->grant_table->lock);
             rcu_unlock_domain(d);
             put_page(page);
             return ret;
@@ -2932,7 +2972,7 @@ static int __gnttab_cache_flush(gnttab_c
 
     ret = 0;
 
     if ( d != owner )
-        spin_unlock(&owner->grant_table->lock);
+        read_unlock(&owner->grant_table->lock);
     unmap_domain_page(v);
     put_page(page);
@@ -3152,7 +3192,7 @@ grant_table_create(
         goto no_mem_0;
 
     /* Simple stuff. */
-    spin_lock_init(&t->lock);
+    rwlock_init(&t->lock);
     spin_lock_init(&t->maptrack_lock);
 
     t->nr_grant_frames = INITIAL_NR_GRANT_FRAMES;
@@ -3262,7 +3302,7 @@ gnttab_release_mappings(
         }
 
         rgt = rd->grant_table;
-        spin_lock(&rgt->lock);
+        read_lock(&rgt->lock);
 
         act = active_entry_acquire(rgt, ref);
         sha = shared_entry_header(rgt, ref);
@@ -3325,7 +3365,7 @@ gnttab_release_mappings(
             gnttab_clear_flag(_GTF_reading, status);
 
         active_entry_release(act);
-        spin_unlock(&rgt->lock);
+        read_unlock(&rgt->lock);
 
         rcu_unlock_domain(rd);
@@ -3373,7 +3413,7 @@ static void gnttab_usage_print(struct do
     printk("      -------- active --------       -------- shared --------\n");
     printk("[ref] localdom mfn      pin          localdom gmfn     flags\n");
 
-    spin_lock(&gt->lock);
+    read_lock(&gt->lock);
 
     if ( gt->gt_version == 0 )
         goto out;
@@ -3426,7 +3466,7 @@ static void gnttab_usage_print(struct do
     }
 
  out:
-    spin_unlock(&gt->lock);
+    read_unlock(&gt->lock);
 
     if ( first )
         printk("grant-table for remote domain:%5d ... "
--- a/xen/include/xen/grant_table.h
+++ b/xen/include/xen/grant_table.h
@@ -64,6 +64,11 @@ struct grant_mapping {
 
 /* Per-domain grant information. */
 struct grant_table {
+    /*
+     * Lock protecting updates to grant table state (version, active
+     * entry list, etc.)
+     */
+    rwlock_t              lock;
     /* Table size. Number of frames shared with guest */
     unsigned int          nr_grant_frames;
     /* Shared grant table (see include/public/grant_table.h). */
@@ -84,8 +89,6 @@ struct grant_table {
     unsigned int          maptrack_limit;
     /* Lock protecting the maptrack page list, head, and limit */
     spinlock_t            maptrack_lock;
-    /* Lock protecting updates to active and shared grant tables. */
-    spinlock_t            lock;
     /* The defined versions are 1 and 2.  Set to 0 if we don't know
       what version to use yet. */
     unsigned              gt_version;
@@ -103,7 +106,7 @@ gnttab_release_mappings(
     struct domain *d);
 
 /* Increase the size of a domain's grant table.
- * Caller must hold d's grant table lock.
+ * Caller must hold d's grant table write lock.
  */
 int
 gnttab_grow_table(struct domain *d, unsigned int req_nr_frames);